The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

CHARLES LEISERSON: Today we're going to talk about dynamic storage allocation, which is a hugely rich and interesting topic, and hopefully cover some of the basics of dynamic storage allocation today. I want to start out with perhaps the simplest storage allocation scheme, and that's stack allocation, which I think you're probably fairly familiar with. So stack allocation is just an array and a pointer into the array, an index or a pointer into the array. And when you have a stack in computer memory, you're talking about the array of all memory, and the stack pointer is the one that's used, for example, for function calls. The reason stack allocation is neat is because it's so simple.
If I want to allocate x bytes, then what I do is simply increment my stack pointer by x bytes, and then I return the original position of the stack pointer as the location of the storage that's just been allocated. So, pretty simple. Typically, if you write a stack allocator, the routines are so simple that they get inlined. It's of course advantageous to declare them inline, but most compilers these days will inline them anyway. And as you can see, it's just a couple of operations to perform this calculation. To deallocate the bytes, we basically-- oh, yes, I should mention, this isn't checking for overflow. So technically, one should check for overflow. One way to do that is with the assert package, and that way, in production builds at run time, you don't have to pay the extra overhead of checking. And see, these operations are so cheap that actually checking for overflow, checking to see whether you went over the right end of the array here, is perhaps as expensive as doing the rest of the operations themselves. So to free x bytes, you just simply decrement the stack pointer.
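As a concrete sketch of what's being described, here is a minimal stack allocator in C. The array size and the names are made up for illustration; the asserts are the overflow and underflow checks just mentioned, and they compile away when NDEBUG is defined.

```c
// Minimal sketch of a stack allocator: an array plus a pointer into it.
#include <assert.h>
#include <stddef.h>

#define STACK_SIZE 4096               /* illustrative size */

static char stack_mem[STACK_SIZE];    /* the array */
static char *sp = stack_mem;          /* the stack pointer into it */

/* Allocate x bytes: bump the pointer, return its old position. */
static void *stack_alloc(size_t x) {
    assert(sp + x <= stack_mem + STACK_SIZE);   /* check for overflow */
    void *p = sp;
    sp += x;
    return p;
}

/* Free x bytes: just decrement the stack pointer. */
static void stack_free(size_t x) {
    assert(sp - x >= stack_mem);                /* check for underflow */
    sp -= x;
}
```

Both routines are just a couple of pointer operations, which is why compilers happily inline them.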
And of course, technically, you should also check for underflow: make sure somebody didn't say to deallocate more bytes than were actually allocated on the stack. So of course, check for stack underflow. So this is pretty good, because allocating and freeing take just constant time. Very quick. But you're required to free memory consistent with the stack discipline, which is the last-in, first-out discipline. So you don't get to free something that you allocated long ago until you've freed everything else that's been allocated since. So therefore, it has limited applicability, but it's great when it works. Now it turns out, sometimes you can actually use the call stack yourself to store stack values, using a function called alloca. But this function is now deprecated, meaning that people advise you not to use it in any modern programming. And part of the reason is that the compiler is more efficient with fixed-size stack frames. You remember when we went through how the compiler actually lays out memory and does the calling convention?
If you have a fixed-size stack frame, then the frame pointer and the stack pointer are redundant, and so you can reclaim a register, namely, the frame pointer register, and do everything off the stack pointer. But if you use things like alloca, which push the size of the stack frame around, then you actually have to keep both pointers there, and that's a little bit less efficient in terms of the register usage. OK? So that's pretty much stack storage. Any questions about stacks? Yeah?

AUDIENCE: Is there anything that detects overflow or underflow without using asserts? Like, if you wanted to check--

CHARLES LEISERSON: Yeah, you could check explicitly. Just say, for this, before you do the allocation, check: is SP plus x bigger than the right end?

AUDIENCE: But how do you know where the right end is?

CHARLES LEISERSON: Well, because I'm assuming here that in this case, you are allocating it. Now, if you're talking about using the call stack, you mean? You'll never run out. We'll do that analysis in a little bit. But you'll never run out of stack space.
Although as a practical matter, most compilers assume a stack of a megabyte, and so if you overflow that, your program will fail in any of a variety of ways.

AUDIENCE: You can do guard pages.

CHARLES LEISERSON: You can do guard pages, which don't always work. But let me explain guard pages, OK? So what's sometimes done is, if I have my stack growing down in memory-- OK, so here's my stack, and it's growing this way-- and the thing that's typically growing up the other way is the heap, and I want to make sure that they don't collide somewhere in here. So one thing I can do is put some pages in here that are protected. These are read-only pages. Or actually, even better, is no permissions. So I can set no permissions, so that if you ever try to read or write that page, it automatically generates a fault. And so the runtime will sometimes do that. The difficulty is, what happens if you increased your stack so much that you went beyond your guard pages? But as a practical matter, most of the time that people exceed the guard pages, it's only by a small amount.
And so just putting a few guard pages in there will make sure that you can use the memory-management hardware, which basically ends up being free from your point of view as a programmer, in order to catch a stack overflow problem. And you can do that yourself. You can use the call mmap, which we haven't talked about, to set permissions on pages at user level.

AUDIENCE: [INAUDIBLE]

CHARLES LEISERSON: No. You can actually handle the error however you want. You can vector into a user-defined routine. Now, what you're going to do, I don't know, but you can indeed catch the error yourself and decide that there's some way of recovering, as opposed to crashing and burning. So generally, it's considered poor form for your software to fail, other than in a controlled way. So you should always have your software exiting or calling error, and it should never segfault. Most people consider any program that segfaults to be poorly engineered. Is there any case where there's a segfaulting program that is not poorly engineered? I don't know. I think "most people" is probably an understatement.
OK? So this is stack memory. You can build it yourself out of an array and a pointer, or you can use the call stack. And when you have something that's using that kind of discipline, it's great, because it's really cheap. Now, in C and C++, and we're going to start moving and talking more about things like C++, we have what's called heap allocation. And in C, the routines are called malloc and free, and C++ provides the new and delete operators. So unlike Java and Python, C and C++ provide no garbage collector. So in other words, you're responsible as a programmer for freeing allocated storage yourself. If you fail to do so, you create a memory leak. So a memory leak means I allocated something, I am no longer in a position to use it, but I failed to free it, so it can't be reused. And if you keep doing that, especially if you're in a loop, then you have a memory leak. Where if you look at a program with a memory leak, and you look using top
or whatever, how big is my program, you watch as it goes and computes and computes, and it keeps getting bigger and bigger and bigger as it's running, and it just runs away, getting more and more storage. Also, you have to watch out for things like dangling pointers. That is, you deallocated something, but somebody still wanted to use it. Or double freeing: you have two places in the code where you free the same storage. Very bad to try to free the same storage twice. Yeah?

AUDIENCE: [INAUDIBLE]

CHARLES LEISERSON: It's basically going to be sitting in the storage structure, which we'll talk about how those are organized in a few minutes-- going to be sitting in the storage structure, but the program is going to look at that and not recognize that it's in the storage structure, and think that it's something whose pointers can now all be reset. And so you basically clobber all the invariants of the storage structure, and somewhere down the road, you get a segmentation fault or something. OK? Yeah, John?
AUDIENCE: [UNINTELLIGIBLE] that comes out of that is--

CHARLES LEISERSON: Just a second. Why don't you take this?

AUDIENCE: [INAUDIBLE] One thing that often happens in the real world is, when you free something, it adds it to a free list, and if you free something twice, it will add two copies of that pointer to the free list. So later on, when you malloc, malloc might return the same pointer to you twice. So the same memory block might be allocated for two completely different functionalities, and attackers use that to do all sorts of interesting things to your program.

CHARLES LEISERSON: Yeah. OK? So fortunately, there are some really good-- why don't you keep that down there, and then if you guys want to chime in, you'll have some power to do so. So fortunately, there are now some tools that make it easier to debug these kinds of really nasty bugs. The one we'll be using is Valgrind, and what you do is you just say valgrind --leak-check=yes, and then run your program the way you would normally run it.
And what it will do is, it will monitor the way that the program is doing allocations and frees, and give you a report as to whether everything that was allocated was freed, and whether you tried to free something twice, and so forth. Now, this works with the existing allocators. If you have your own allocator that you decide you're going to write for a little piece of your code, you say, gee, I'm going to do my own little allocation here, you can actually communicate to Valgrind what that allocator is, and it can check for proper use of that structure as well. So you can see valgrind.org for details. And there are a lot of other programs on the market, and out there as open software, such as Purify and a variety of others. So let's talk about heap allocation. I want to start by talking about fixed-size allocation. So suppose the world has only one size of objects that you care about. Then it turns out that writing an allocator is really simple, and you could program one yourself. So here's basically how it works. It's called using a free list.
And what you have is, in your array, say, that you're using for storing all of your values, and here they're basically all the same size, you have a certain number of slots that are used. That means that the program is using them, not the storage allocator. And what you do is you take advantage of the fact that if it's an unused block of storage, then you can put a pointer in there. And so what you do is you string all these unused blocks of storage together into a free list. So we point to there, which points to there, which points to there, which points to there, which points nowhere. So you just simply link together all of the things that are free into that list. And then whenever you need to allocate an object, what you do is, you grab one off the free list. So you set x to the head of the free list: here x gets a pointer to this object, the one that's at the start of the list. Then you update free with free->next, which gives you this guy. So that's the next guy in line. And then you basically return x.
And that's all there is to it. You just grab the first element off the free list. Now, there's a little bit of subtlety, in that the thing that you're grabbing off has a garbage pointer in it, because we're not initializing storage. There are some storage allocators that will always set the value that's returned to the user to all zeros, for example. But typically malloc doesn't do that. It basically leaves that as uninitialized storage. And so you can actually go and see what values used to be stored in there. And so basically, you can end up with a garbage pointer there, but you really don't care about that, because it's no longer on the list. So you can see, for example, if I then went and I-- well, let's do this: take a look at what happens when we free an item. So if I want to free an object x, I take some object x here. Let's say it's this guy that's being freed. What I do is, I make it point to the first element of the list. So basically, x->next equals free, so he basically points there.
And then I say free equals x, meaning move the free pointer up to now point to the new beginning of the list. So we're basically pushing and popping things off the front of the list. It's just the stack discipline, except we're using a linked list, rather than using an array. So we push to free, and we pop to allocate. Now the tricky thing here-- you can sort of see what might go wrong if I then said, oh, let me free this thing again. Suppose you said, now that this is free, suppose you came in and say, oh, I'll free that thing again. Well, now you're going to go through the same code here, and you're going to see, you're going to completely screw up the free list. So that's why you don't want to do a double free. So you can see exactly what's going to go on in the free list implementation. So this is pretty good and pretty cheap. Allocating and freeing take constant time. That's great, just as with the stack pointer. It's got good temporal locality, and the reason is because you're tending to allocate the thing that you freed most recently.
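The allocate and free operations just described fit in a few lines of C. This is a sketch with made-up pool and block sizes; note that, exactly as discussed, nothing here detects a double free.

```c
// Sketch of a fixed-size free-list allocator.
#include <stddef.h>

#define POOL_BLOCKS 1024              /* illustrative pool size */

/* An unused block stores a pointer to the next free block; the
 * allocator reuses the block's own bytes to hold this link. */
typedef union block {
    union block *next;                /* valid only while the block is free */
    char payload[64];                 /* fixed object size (example value) */
} block_t;

static block_t pool[POOL_BLOCKS];
static block_t *free_list = NULL;

/* String every block of the pool together into the free list. */
static void pool_init(void) {
    for (int i = 0; i < POOL_BLOCKS; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Allocate: pop the first element off the free list. */
static void *pool_alloc(void) {
    if (free_list == NULL) return NULL;   /* out of blocks */
    block_t *x = free_list;               /* x = free */
    free_list = free_list->next;          /* free = free->next */
    return x;                             /* returned storage is uninitialized */
}

/* Free: push the block back onto the front of the list.  Freeing the
 * same block twice corrupts the list; nothing here detects it. */
static void pool_free(void *p) {
    block_t *x = p;
    x->next = free_list;                  /* x->next = free */
    free_list = x;                        /* free = x */
}
```

Each operation is a couple of pointer moves, so both run in constant time, and the LIFO order is what gives the temporal locality discussed next.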
The free list operates as a last-in, first-out data structure. So from a temporal locality point of view, it's really very good. You're always using the most recently freed stuff whenever you allocate, and that means you have very good caching behavior from a temporal locality point of view. However, it has poor spatial locality, and the reason is due to what's called external fragmentation, where you get your storage distributed across virtual memory. So if you think about this, as you start allocating and freeing in different orders, the order of that free list gets very jumbled up. And as that increases, you end up jumping from one thing to another, and you can end up using a lot of the page table, because you're basically touching a huge piece of it. And in particular, the translation lookaside buffer can also be a problem. So the translation lookaside buffer typically works on a page basis. And if you recently referenced things on a given page, that's very good for the TLB, because it's already got the translation, the virtual address mapping, of that page dealt with.
So it basically has poor spatial locality, because it's getting all jumbled up. Compare that to stack allocation, which is always very nice in terms of its behavior, because you're going across consecutive locations in memory, which means you're always allocating and freeing on the same page, unless you trip over a page boundary. So here are some ways of mitigating external fragmentation. Of course, this makes the code for the free list more complicated. And then you have to weigh, how much am I losing in having a more complicated allocator, versus suffering the fragmentation that may occur? Yes, question?

AUDIENCE: You specified before that it was fixed size. Is that why we don't specify anything about the size?

CHARLES LEISERSON: That's right. We're just assuming that it's all whatever the unit size is. So this is really good when you've created your own data structure, and let's say you're doing nodes in a graph. You know exactly what elements you want in the graph, for each vertex.
And now what you want to do is basically be able to give yourself a node of the graph, free a node of the graph, and manage that storage yourself. Yes, question?

AUDIENCE: [INAUDIBLE] allocated across virtual memory [INAUDIBLE] one application?

CHARLES LEISERSON: Why is it allocated across virtual memory if it's all one application? Well, because the size that you may end up needing can end up being large. And I'll show you in just a minute how that happens. So here's the idea. What you can do is keep a free list per disk page. So a traditional page is 4,000 bytes. These days they talk about super pages that are megabytes, and who knows what size pages various operating systems will support. I believe the clouds, as configured right now, are using 4K, which leads to some strange anomalies where the TLB can address less stuff than the L3 cache. But so be it.
So basically, you just keep a free list per page, and always allocate from the free list of the page that's fullest, the page that you've allocated the most stuff off of, rather than from a page that's more empty. So what this means is that when you come to allocate, you have to figure out, which page should I do the allocation off of? But we'll have a free list on each page. Whenever you free a block of storage, you have to free it to the page on which the block resides. Don't put it into the free list of a different page. You put it into the free list of that particular page. Now, if a page becomes empty, in other words, there's only the free list on it, and there are no elements actually used by the programmer, the virtual memory system will page it out. And since the programmer's never referencing things on that page, it basically costs you nothing. You end up not having to have an entry in the TLB, and you may have written to it, but you will end up allocating things preferentially to the more full pages. And the basic idea is, it's better to have the use of your pages balanced 90/10 than 50/50. So this kind of configuration is better than that. So why is that? If you have 90% of your storage on one page and only 10% on another-- yeah, let's say we have only two pages in our system, and 90% of the objects are on one page, and 10% of the objects are on the other page, versus if they're 50/50. Then what you can do is look at the probability that two accesses are going to go to the same page. Because if you're going to the same page, then your paging hardware, your TLB, which is very much like a cache-- whenever you get a hit on the same page, that's good, it costs you nothing. If you hit on a different page, that's bad. So what happens is, the probability that two accesses hit the same page is, well, what's the probability that the first one was on this page, versus on the other? They're going to be on the same page if they're both on this page or they're both on the other page. And so that's going to be 0.9 times 0.9 plus 0.1 times 0.1. Whereas if it's 50/50, then the probability that you get a hit on the same page is 0.5 times 0.5 plus 0.5 times 0.5, which is only 50%.
455 00:23:39,460 --> 00:23:45,520 So we get an 82% hit rate versus the 50% hit rate. 456 00:23:45,520 --> 00:23:48,200 So it's much more likely. Taking it to 457 00:23:48,200 --> 00:23:51,140 the extreme, it's better to have everything on one page 458 00:23:51,140 --> 00:23:52,890 and nothing on the other. 459 00:23:52,890 --> 00:23:55,640 Then I get a 100% hit rate, because whenever I'm accessing 460 00:23:55,640 --> 00:24:00,750 something, I always am hitting on the page I had before. 461 00:24:00,750 --> 00:24:03,770 So this kind of heuristic helps it so that you're 462 00:24:03,770 --> 00:24:09,940 swinging your accesses to using fewer pages, and the 463 00:24:09,940 --> 00:24:13,300 fewer pages, the less likely it is that you're going to 464 00:24:13,300 --> 00:24:16,860 stress your virtual memory paging, disk swapping, you're 465 00:24:16,860 --> 00:24:18,840 going to stress your TLB and what have you. 466 00:24:18,840 --> 00:24:21,770 467 00:24:21,770 --> 00:24:23,660 Good for that? 468 00:24:23,660 --> 00:24:25,240 We're going to get deeper into this, so. 469 00:24:25,240 --> 00:24:30,130 470 00:24:30,130 --> 00:24:33,270 So that's basically fixed size allocation. 471 00:24:33,270 --> 00:24:35,230 It's really nice, because you can basically use a very 472 00:24:35,230 --> 00:24:37,220 simple free list. 473 00:24:37,220 --> 00:24:40,350 But of course what most people need and want is variable 474 00:24:40,350 --> 00:24:42,240 sized allocation. 475 00:24:42,240 --> 00:24:46,830 The most popular way of doing variable sized allocation is a 476 00:24:46,830 --> 00:24:50,020 scheme called binned free lists, and of that, there 477 00:24:50,020 --> 00:24:51,680 are a zillion variations. 478 00:24:51,680 --> 00:24:55,180 But mostly, people use binned free lists. 479 00:24:55,180 --> 00:24:58,330 The idea is, we're going to leverage the efficiency of 480 00:24:58,330 --> 00:24:59,880 normal free lists. 
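The arithmetic behind that 90/10-versus-50/50 comparison is easy to check directly. Here is a small sketch in C; the function name is mine, not from the lecture:

```c
#include <assert.h>

/* Two-page system: a fraction p of the objects live on one page and
   1 - p on the other.  Two independent accesses land on the same
   page with probability p*p + (1-p)*(1-p), as computed above. */
static double same_page_prob(double p) {
    return p * p + (1.0 - p) * (1.0 - p);
}
```

For p = 0.9 this gives the 82% figure; for p = 0.5 it gives 50%. The more skewed the page usage, the better the hit rate.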
481 00:24:59,880 --> 00:25:02,800 With normal free lists I can allocate and free in just one 482 00:25:02,800 --> 00:25:07,370 operation, in just a constant amount of time. 483 00:25:07,370 --> 00:25:09,700 And what we're going to do is accept some internal 484 00:25:09,700 --> 00:25:11,220 fragmentation. 485 00:25:11,220 --> 00:25:14,080 So by internal fragmentation I mean the following. 486 00:25:14,080 --> 00:25:19,130 When somebody asks for something of a given size, 487 00:25:19,130 --> 00:25:23,200 what we're going to do is actually give them something 488 00:25:23,200 --> 00:25:27,350 which is a little bit bigger, maybe. 489 00:25:27,350 --> 00:25:28,815 And so we'll waste some. 490 00:25:28,815 --> 00:25:32,460 And in particular, what we'll do is, we'll organize our 491 00:25:32,460 --> 00:25:36,550 system so that all the blocks that we ever give out are 492 00:25:36,550 --> 00:25:39,250 always a size that's a power of two. 493 00:25:39,250 --> 00:25:41,980 In practice, you don't always give out exactly a power of 494 00:25:41,980 --> 00:25:44,260 two, because then you get cache alignment problems. 495 00:25:44,260 --> 00:25:49,090 But this is the basic scheme: to give out things that are 496 00:25:49,090 --> 00:25:50,880 powers of two. 497 00:25:50,880 --> 00:25:53,930 And that way, if somebody asks for something that's 13 498 00:25:53,930 --> 00:25:57,270 bytes, you give them 16 bytes. 499 00:25:57,270 --> 00:26:00,780 For somebody who asks for 65 bytes, you have to 500 00:26:00,780 --> 00:26:03,570 give them 128 bytes. 501 00:26:03,570 --> 00:26:07,630 So you might waste up to as much as a factor of two. 502 00:26:07,630 --> 00:26:11,590 But now, for a given range of values, there's only 503 00:26:11,590 --> 00:26:15,000 a logarithmic number of different sizes that we have 504 00:26:15,000 --> 00:26:16,770 to give out, because we're only giving them out 505 00:26:16,770 --> 00:26:19,910 in powers of two. 
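The rounding rule just described (13 goes to 16, 65 goes to 128) can be written down in a few lines of C; the function name is my own:

```c
#include <assert.h>
#include <stddef.h>

/* Round a request up to the next power of two -- the internal
   fragmentation described above: you waste at most a factor of two. */
static size_t round_pow2(size_t x) {
    size_t s = 1;
    while (s < x)
        s <<= 1;      /* double until the block covers the request */
    return s;
}
```

A 13-byte request becomes a 16-byte block and a 65-byte request becomes a 128-byte block, so the wasted space is always strictly less than the amount actually in use.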
506 00:26:19,910 --> 00:26:27,430 So here's the basic allocation scheme to allocate x bytes. 507 00:26:27,430 --> 00:26:32,660 So what I do is, if I'm allocating x bytes, what I do 508 00:26:32,660 --> 00:26:38,870 is, I look in the bin ceiling of log x, because that's where 509 00:26:38,870 --> 00:26:42,650 everything of size two to the ceiling of log x is, which is 510 00:26:42,650 --> 00:26:46,890 basically rounding x up to the next power of two. 511 00:26:46,890 --> 00:26:51,750 If it's non-empty, I just return a block as if it were 512 00:26:51,750 --> 00:26:54,030 an ordinary free list. 513 00:26:54,030 --> 00:26:56,350 And what I'll be doing is returning a block which is at 514 00:26:56,350 --> 00:26:59,380 most twice the size of the requested block. 515 00:26:59,380 --> 00:27:01,040 But that's generally OK. 516 00:27:01,040 --> 00:27:03,940 Somebody wants at least that many bytes. 517 00:27:03,940 --> 00:27:06,240 The fact that it's a few more bytes won't matter to them. 518 00:27:06,240 --> 00:27:08,800 They don't know that there are more bytes hidden there. 519 00:27:08,800 --> 00:27:11,720 They only take advantage of the fact that there are as many 520 00:27:11,720 --> 00:27:15,760 bytes as they've asked for. 521 00:27:15,760 --> 00:27:20,190 Otherwise, suppose that that bin is empty. 522 00:27:20,190 --> 00:27:22,930 There's nothing in that bin at this time. 523 00:27:22,930 --> 00:27:28,250 Well, then what you do is you start looking to find a block 524 00:27:28,250 --> 00:27:32,430 in the next larger non-empty bin, and you split it up. 525 00:27:32,430 --> 00:27:34,180 So for example here, suppose that the 526 00:27:34,180 --> 00:27:36,820 request is x equals 3. 527 00:27:36,820 --> 00:27:40,705 I have to look in bin 2, which is blocks of size 4. 528 00:27:40,705 --> 00:27:42,960 Uh oh, bin 2 is empty. 529 00:27:42,960 --> 00:27:46,001 So what I do is, I start going down, looking in the next bin. 
530 00:27:46,001 --> 00:27:48,190 Oops, that's empty too, that's empty-- 531 00:27:48,190 --> 00:27:51,960 and then finally, I get to one where there is a block in it. 532 00:27:51,960 --> 00:27:54,160 So what I do then is I dice this up. 533 00:27:54,160 --> 00:27:59,090 I cut it in half, and then cut that one in half, until I get 534 00:27:59,090 --> 00:28:02,490 a piece that's exactly the size I want to return. 535 00:28:02,490 --> 00:28:03,180 OK? 536 00:28:03,180 --> 00:28:05,960 And then I distribute all of these guys up to the other 537 00:28:05,960 --> 00:28:07,210 bins, like this. 538 00:28:07,210 --> 00:28:12,230 539 00:28:12,230 --> 00:28:15,570 So I return one of this size, and I populate these other 540 00:28:15,570 --> 00:28:22,490 guys with one block each of the appropriate size. 541 00:28:22,490 --> 00:28:24,910 And as you can see, they're basically growing in a 542 00:28:24,910 --> 00:28:27,630 geometric fashion. 543 00:28:27,630 --> 00:28:30,070 And then I distribute that bin. 544 00:28:30,070 --> 00:28:33,760 Now there's a caveat here. 545 00:28:33,760 --> 00:28:39,230 Suppose that no larger blocks exist in my storage structure. 546 00:28:39,230 --> 00:28:43,720 Well, then I have to ask the operating system for more 547 00:28:43,720 --> 00:28:50,860 bytes, to allocate x more bytes of virtual memory. 548 00:28:50,860 --> 00:28:53,345 Now in practice what you do is you don't actually ask for x, 549 00:28:53,345 --> 00:28:54,620 since you never ask the operating 550 00:28:54,620 --> 00:28:56,590 system for small values. 551 00:28:56,590 --> 00:28:59,300 You always ask the operating system for big values. 552 00:28:59,300 --> 00:29:01,170 Give me another page worth, or give me another 553 00:29:01,170 --> 00:29:03,110 five pages, or something. 554 00:29:03,110 --> 00:29:06,000 Because that's an expensive call to the operating system, 555 00:29:06,000 --> 00:29:09,130 to ask it to allocate more storage. 
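The allocate-and-split procedure just walked through can be sketched in C. This is a toy version of my own, not the allocator from the lecture's slides: all names are mine, there are no block headers, no coalescing, and no OS call when every bin is empty -- it only shows the bin indexing and the repeated halving:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* bin[k] is a singly linked free list of blocks of size 2^k bytes.
   A free block's first word is reused as the "next" pointer. */
#define NBINS 20

static void *bin[NBINS];
static uint64_t arena[128];                 /* 1024-byte demo arena */

static unsigned bin_of(size_t x) {          /* ceil(lg x), clamped so a  */
    if (x < sizeof(void *))                 /* free block can still hold */
        x = sizeof(void *);                 /* its next pointer          */
    unsigned k = 0;
    while (((size_t)1 << k) < x)
        k++;
    return k;
}

static void bin_push(unsigned k, void *p) {
    *(void **)p = bin[k];
    bin[k] = p;
}

/* Allocate x bytes: pop from bin ceil(lg x); if that bin is empty,
   take a block from the next larger non-empty bin and cut it in half
   repeatedly, parking the spare halves in the smaller bins. */
static void *bin_alloc(size_t x) {
    unsigned k = bin_of(x), j = k;
    while (j < NBINS && bin[j] == NULL)
        j++;
    if (j == NBINS)
        return NULL;                        /* here you'd ask the OS */
    void *p = bin[j];
    bin[j] = *(void **)p;
    while (j > k) {                         /* keep the bottom half,     */
        j--;                                /* free the top half to bin j */
        bin_push(j, (char *)p + ((size_t)1 << j));
    }
    return p;
}

static void bin_free(void *p, size_t x) {   /* back onto its size's bin */
    bin_push(bin_of(x), p);
}

/* Demo: seed one 1024-byte block, allocate 3 bytes, and check that the
   splitting populated the smaller bins as in the lecture's figure. */
static int demo(void) {
    char *base = (char *)arena;
    bin_push(10, base);                     /* 1024 = 2^10 */
    char *p = bin_alloc(3);                 /* served as an 8-byte block */
    int ok = p == base
        && bin[3] == base + 8               /* halves parked on the way  */
        && bin[9] == base + 512;            /* down: 512, 256, ..., 8    */
    bin_free(p, 3);
    return ok && bin[3] == base;
}
```

A real allocator would also record each block's size in a header so that freeing doesn't need the size passed back in.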
556 00:29:09,130 --> 00:29:10,570 Let me explain how that works. 557 00:29:10,570 --> 00:29:13,070 And this gets back to the question you had before, which 558 00:29:13,070 --> 00:29:16,460 is, where's the stuff coming from anyway? 559 00:29:16,460 --> 00:29:19,570 What the operating system is allocating is actually not 560 00:29:19,570 --> 00:29:24,620 physical pages, but rather parts of virtual memory. 561 00:29:24,620 --> 00:29:29,830 So we looked at this before, about how virtual memory is 562 00:29:29,830 --> 00:29:33,150 laid out for a typical program. 563 00:29:33,150 --> 00:29:35,790 So what we do is, on the very high addresses, we have the 564 00:29:35,790 --> 00:29:37,760 stack, and the stack grows downwards. 565 00:29:37,760 --> 00:29:39,480 This is the call stack. 566 00:29:39,480 --> 00:29:41,750 At the low address, we typically start out with 567 00:29:41,750 --> 00:29:42,950 what's called the text segment, 568 00:29:42,950 --> 00:29:44,200 which holds your code. 569 00:29:44,200 --> 00:29:46,320 570 00:29:46,320 --> 00:29:49,860 Then there's a region called the data segment, and this 571 00:29:49,860 --> 00:29:54,480 consists of data that needs to be initialized. 572 00:29:54,480 --> 00:29:58,320 So the compiler typically distinguishes between data 573 00:29:58,320 --> 00:30:01,160 that needs initialization and data which is going to be 574 00:30:01,160 --> 00:30:05,150 zero, because data that needs initialization, it writes into 575 00:30:05,150 --> 00:30:08,610 the binary executable file and stores that on disk. 576 00:30:08,610 --> 00:30:11,580 So when you load in your program, it just loads as part 577 00:30:11,580 --> 00:30:14,710 of the text here, it loads that data segment, saying what 578 00:30:14,710 --> 00:30:16,370 all of the storage is. 579 00:30:16,370 --> 00:30:19,390 The next segment is, for historical reasons, called the 580 00:30:19,390 --> 00:30:21,250 BSS segment. 
581 00:30:21,250 --> 00:30:26,780 And what BSS consists of is all the variables that at the 582 00:30:26,780 --> 00:30:30,480 start of the program should be initialized to 0. 583 00:30:30,480 --> 00:30:34,700 It doesn't bother storing these in the file. 584 00:30:34,700 --> 00:30:35,500 Why not? 585 00:30:35,500 --> 00:30:38,520 Because it's easy enough to just bring it in, know 586 00:30:38,520 --> 00:30:43,980 how big that region is, and then do a memset of this 587 00:30:43,980 --> 00:30:48,170 whole region with zero, to zero it out, rather than 588 00:30:48,170 --> 00:30:50,185 trying to read in a whole bunch of zeroes. 589 00:30:50,185 --> 00:30:52,800 590 00:30:52,800 --> 00:30:55,790 An obvious kind of data compression. 591 00:30:55,790 --> 00:30:59,740 Then heap storage is allocated in the space 592 00:30:59,740 --> 00:31:02,920 above the BSS segment. 593 00:31:02,920 --> 00:31:05,700 So the idea is, there's a part of the heap which already has 594 00:31:05,700 --> 00:31:07,360 allocated stuff in it. 595 00:31:07,360 --> 00:31:10,280 And whenever you need more heap storage, you ask the 596 00:31:10,280 --> 00:31:13,400 operating system-- the operating system moves, just as 597 00:31:13,400 --> 00:31:18,850 if it were a stack, the line for where the heap ends, to 598 00:31:18,850 --> 00:31:22,260 give you more pages of heap storage that you 599 00:31:22,260 --> 00:31:25,660 can now allocate out of. 600 00:31:25,660 --> 00:31:27,110 And what it's doing is allocating 601 00:31:27,110 --> 00:31:28,360 pieces of virtual memory. 602 00:31:28,360 --> 00:31:31,380 603 00:31:31,380 --> 00:31:37,630 Now, here's a good question to think about, which I think gets 604 00:31:37,630 --> 00:31:40,570 directly to your point. 605 00:31:40,570 --> 00:31:42,070 So think about this. 606 00:31:42,070 --> 00:31:45,370 We have a 64-bit address space. 
607 00:31:45,370 --> 00:31:49,480 If I try to write at the rate of 4 billion bytes per second, 608 00:31:49,480 --> 00:31:51,640 that's pretty fast, because that's more like the speed at which the 609 00:31:51,640 --> 00:31:52,590 registers get written. 610 00:31:52,590 --> 00:31:55,360 Not the speed at which memory gets written. 611 00:31:55,360 --> 00:32:00,060 So really, 4 billion bytes per second is generous. 612 00:32:00,060 --> 00:32:07,330 It takes over a century to write all of virtual memory. 613 00:32:07,330 --> 00:32:10,840 So as a practical matter, that means I never run 614 00:32:10,840 --> 00:32:12,600 out of stack space. 615 00:32:12,600 --> 00:32:16,370 I never run out of heap space. 616 00:32:16,370 --> 00:32:18,910 They're never going to cross, no matter how much I'm 617 00:32:18,910 --> 00:32:19,760 allocating. 618 00:32:19,760 --> 00:32:22,980 Even if I allocate on every cycle, there's not going to be 619 00:32:22,980 --> 00:32:24,780 a program-- unless you're going to set a program running 620 00:32:24,780 --> 00:32:28,670 for a millennium sort of thing, you're not 621 00:32:28,670 --> 00:32:29,550 going to see that. 622 00:32:29,550 --> 00:32:33,190 Most programs run for a few minutes. 623 00:32:33,190 --> 00:32:34,205 Some may run for days. 624 00:32:34,205 --> 00:32:35,980 Some may run for years. 625 00:32:35,980 --> 00:32:38,220 Very few people write programs to run for a century. 626 00:32:38,220 --> 00:32:41,370 627 00:32:41,370 --> 00:32:44,920 So why not just allocate out of free virtual memory and 628 00:32:44,920 --> 00:32:45,850 never bother freeing? 629 00:32:45,850 --> 00:32:48,810 Why am I worried about freeing stuff all the time? 630 00:32:48,810 --> 00:32:50,080 Why not just allocate it? 631 00:32:50,080 --> 00:32:50,860 It's virtual memory. 632 00:32:50,860 --> 00:32:52,150 It's virtual. 633 00:32:52,150 --> 00:32:55,290 I just allocate more storage as I need it. 634 00:32:55,290 --> 00:32:56,090 Yeah? 
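The over-a-century claim checks out numerically. A sketch in C, using the lecture's 4-billion-bytes-per-second rate (the function name is mine):

```c
#include <assert.h>

/* Writing all 2^64 bytes of a 64-bit address space at 4 billion
   bytes per second: how many years does that take? */
static double years_to_write_vm(void) {
    double bytes = 65536.0 * 65536.0 * 65536.0 * 65536.0; /* 2^64      */
    double seconds = bytes / 4e9;                         /* at 4 GB/s */
    return seconds / (365.25 * 24.0 * 3600.0);
}
```

This works out to roughly 146 years, comfortably over the century claimed.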
635 00:32:56,090 --> 00:32:57,870 AUDIENCE: [INAUDIBLE] memory? 636 00:32:57,870 --> 00:32:59,410 CHARLES LEISERSON: So it is tied to real memory, and 637 00:32:59,410 --> 00:33:01,070 that's part of the answer. 638 00:33:01,070 --> 00:33:03,800 It is tied to real memory, because anything that you 639 00:33:03,800 --> 00:33:09,030 write must find itself a place on disk. 640 00:33:09,030 --> 00:33:11,110 Because it's going to be written to pages, and those 641 00:33:11,110 --> 00:33:14,220 pages are going to be paged and then written to what's 642 00:33:14,220 --> 00:33:17,680 called the swap space on disk, which allows you 643 00:33:17,680 --> 00:33:18,810 to put things there. 644 00:33:18,810 --> 00:33:23,320 Even so, though, writing to disk-- 645 00:33:23,320 --> 00:33:26,450 it's still not going to take very long. 646 00:33:26,450 --> 00:33:29,000 I mean, you're never going to be able to use up disk, 647 00:33:29,000 --> 00:33:31,580 because disk takes so long to write to. 648 00:33:31,580 --> 00:33:33,109 But that is part of the answer. 649 00:33:33,109 --> 00:33:38,268 AUDIENCE: So you need to basically use [INAUDIBLE] 650 00:33:38,268 --> 00:33:39,206 software? 651 00:33:39,206 --> 00:33:41,040 And so-- 652 00:33:41,040 --> 00:33:42,520 CHARLES LEISERSON: And the page table. 653 00:33:42,520 --> 00:33:44,560 AUDIENCE: [UNINTELLIGIBLE] 654 00:33:44,560 --> 00:33:45,260 on the disk. 655 00:33:45,260 --> 00:33:47,480 CHARLES LEISERSON: So the issue is, this is where the 656 00:33:47,480 --> 00:33:50,950 external fragmentation would be horrendous. 
657 00:33:50,950 --> 00:33:53,850 If you just kept writing across memory, and you only 658 00:33:53,850 --> 00:33:56,450 had a couple of words written on every page that were 659 00:33:56,450 --> 00:34:01,950 actually used, the rest is just garbage, then what you're 660 00:34:01,950 --> 00:34:06,370 saying is that in order to access a given datum, I'm not 661 00:34:06,370 --> 00:34:09,659 going to have it in memory together with a lot of other useful stuff. 662 00:34:09,659 --> 00:34:11,380 What I'm going to have in memory with it is going to be 663 00:34:11,380 --> 00:34:15,500 all garbage, the stuff that I would have freed. 664 00:34:15,500 --> 00:34:18,100 And so therefore, you end up having your program 665 00:34:18,100 --> 00:34:22,870 distributed across a large region of virtual memory, and 666 00:34:22,870 --> 00:34:25,330 that means as we're doing the paging, et cetera, we're 667 00:34:25,330 --> 00:34:30,330 getting very poor page locality as 668 00:34:30,330 --> 00:34:31,909 I'm accessing things. 669 00:34:31,909 --> 00:34:35,409 And so I have a finite number of pages that I fit in memory, 670 00:34:35,409 --> 00:34:38,610 and now I may be blowing that out, and every access, rather 671 00:34:38,610 --> 00:34:40,520 than going to something nearby, is 672 00:34:40,520 --> 00:34:42,260 instead going to disk. 673 00:34:42,260 --> 00:34:46,770 And so that can cause disk thrashing, where every access 674 00:34:46,770 --> 00:34:48,780 causes a disk access. 675 00:34:48,780 --> 00:34:51,830 So how fast are disk accesses? 676 00:34:51,830 --> 00:34:53,080 Order of magnitude? 677 00:34:53,080 --> 00:34:56,896 678 00:34:56,896 --> 00:34:58,550 How fast are disk accesses? 679 00:34:58,550 --> 00:35:00,680 What's a disk access cost? 680 00:35:00,680 --> 00:35:01,930 Fast disk? 681 00:35:01,930 --> 00:35:04,656 682 00:35:04,656 --> 00:35:05,906 AUDIENCE: [INAUDIBLE] 683 00:35:05,906 --> 00:35:10,057 684 00:35:10,057 --> 00:35:14,054 CHARLES LEISERSON: No, it's slower than that. 
685 00:35:14,054 --> 00:35:15,497 Disk access. 686 00:35:15,497 --> 00:35:18,470 So good memory access, DRAM memory access, is like 60 687 00:35:18,470 --> 00:35:22,070 nanoseconds for a DRAM access. 688 00:35:22,070 --> 00:35:23,810 It's like 60 nanoseconds. 689 00:35:23,810 --> 00:35:28,460 So the processor is going somewhere around, these days, 690 00:35:28,460 --> 00:35:32,370 between 3 and 4 gigahertz. 691 00:35:32,370 --> 00:35:39,210 So memory is like 200 processor cycles away. 692 00:35:39,210 --> 00:35:41,950 How fast is disk? 693 00:35:41,950 --> 00:35:45,050 An ordinary disk, not a solid state device. 694 00:35:45,050 --> 00:35:46,300 Ordinary disk. 695 00:35:46,300 --> 00:35:48,720 696 00:35:48,720 --> 00:35:51,840 10 milliseconds is a good ballpark to give. 697 00:35:51,840 --> 00:35:53,040 That's a good number to know. 698 00:35:53,040 --> 00:35:57,590 10 milliseconds, not 10 microseconds, not 10 699 00:35:57,590 --> 00:35:58,780 nanoseconds. 700 00:35:58,780 --> 00:36:01,510 10 milliseconds. 701 00:36:01,510 --> 00:36:05,510 So you are, what? 702 00:36:05,510 --> 00:36:09,130 6 to 7 orders of magnitude slower if you go to disk. 703 00:36:09,130 --> 00:36:11,930 704 00:36:11,930 --> 00:36:13,840 Really slow. 705 00:36:13,840 --> 00:36:16,280 So one of the things in performance optimization-- 706 00:36:16,280 --> 00:36:18,630 we're focusing a lot on the CPU-- 707 00:36:18,630 --> 00:36:21,070 as is often the case for database things, 708 00:36:21,070 --> 00:36:23,530 is, how do you minimize those disk accesses? 709 00:36:23,530 --> 00:36:26,640 And all the things that we're learning about, performance 710 00:36:26,640 --> 00:36:34,040 for CPUs, can be applied to disk accesses, as well. 
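Converting those ballpark latencies into processor cycles makes the gap vivid. A sketch in C; the 3.5 GHz clock is my own pick from within the stated 3-to-4-gigahertz range:

```c
#include <assert.h>

/* Latency expressed in processor cycles at a 3.5 GHz clock. */
static double cycles_at_3_5ghz(double seconds) {
    return seconds * 3.5e9;
}
```

DRAM at 60 nanoseconds is about 210 cycles, matching the "200 processor cycles away" figure, while a 10-millisecond disk access is about 35 million cycles.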
711 00:36:34,040 --> 00:36:37,280 So the idea here is we want to avoid-- 712 00:36:37,280 --> 00:36:40,280 that's why we don't want to have our stuff strewn across a 713 00:36:40,280 --> 00:36:43,670 whole bunch of pages where we only have a fixed amount of 714 00:36:43,670 --> 00:36:44,980 internal memory, because then we'll end up 715 00:36:44,980 --> 00:36:46,640 paging all the time. 716 00:36:46,640 --> 00:36:50,500 Every access will cause us to do a disk access, because it 717 00:36:50,500 --> 00:36:51,820 won't be very likely that we're getting 718 00:36:51,820 --> 00:36:53,400 stuff on the same page. 719 00:36:53,400 --> 00:36:55,630 So getting back, for example, to the fixed allocation, 720 00:36:55,630 --> 00:37:00,040 that's why it's a good idea to try to keep those pages full, 721 00:37:00,040 --> 00:37:01,790 so that we have a small footprint. 722 00:37:01,790 --> 00:37:06,480 So the goal of most storage allocators is therefore to use 723 00:37:06,480 --> 00:37:11,710 as little virtual memory as possible, and try to keep the 724 00:37:11,710 --> 00:37:14,130 used portions very compact. 725 00:37:14,130 --> 00:37:18,030 Because the more compact it is, the more likely it is that 726 00:37:18,030 --> 00:37:21,180 when you access one, you'll be able to access something 727 00:37:21,180 --> 00:37:25,360 nearby without causing a page fault. 728 00:37:25,360 --> 00:37:27,940 So that's the goal of almost all storage allocators. 729 00:37:27,940 --> 00:37:31,580 Try to keep things compact, try to keep things tight and 730 00:37:31,580 --> 00:37:35,050 in very little virtual memory space, even though virtual 731 00:37:35,050 --> 00:37:36,640 memory is huge. 732 00:37:36,640 --> 00:37:38,790 Now, that doesn't mean that everything has to be 733 00:37:38,790 --> 00:37:39,430 contiguous. 
734 00:37:39,430 --> 00:37:42,650 For example, in this organization as I showed 735 00:37:42,650 --> 00:37:44,730 before, actually, I have it just on the previous slide, so 736 00:37:44,730 --> 00:37:47,360 why don't I go back there-- 737 00:37:47,360 --> 00:37:50,660 these are very far away in memory, but there's only a 738 00:37:50,660 --> 00:37:55,970 couple of them, and they're tight and local in the regions 739 00:37:55,970 --> 00:37:57,650 that they're operating in. 740 00:37:57,650 --> 00:38:00,160 What I don't want to do is have it where I have just 741 00:38:00,160 --> 00:38:01,560 little stuff. 742 00:38:01,560 --> 00:38:05,460 It's OK to go into the middle of virtual memory and have a 743 00:38:05,460 --> 00:38:07,760 whole bunch of consecutive pages full of data. 744 00:38:07,760 --> 00:38:09,360 That's fine. 745 00:38:09,360 --> 00:38:12,060 What's bad is having just a little bit of data, a little 746 00:38:12,060 --> 00:38:14,120 bit of data, and a little bit of data sort of 747 00:38:14,120 --> 00:38:15,290 sprinkled all over. 748 00:38:15,290 --> 00:38:16,680 That's bad. 749 00:38:16,680 --> 00:38:19,930 But as long as I can put a whole bunch of data nearby, it 750 00:38:19,930 --> 00:38:22,950 doesn't matter where in virtual memory I put it, 751 00:38:22,950 --> 00:38:24,200 particularly. 752 00:38:24,200 --> 00:38:26,100 753 00:38:26,100 --> 00:38:29,090 So the idea is, let's try to use as little virtual memory 754 00:38:29,090 --> 00:38:29,980 as possible. 755 00:38:29,980 --> 00:38:31,230 Any questions about that? 756 00:38:31,230 --> 00:38:36,180 757 00:38:36,180 --> 00:38:37,227 OK. 758 00:38:37,227 --> 00:38:38,660 So how do we do that? 759 00:38:38,660 --> 00:38:43,270 So the first thing is, binned free list turns out to do a 760 00:38:43,270 --> 00:38:49,170 really good job of using relatively little memory. 
761 00:38:49,170 --> 00:38:55,930 So suppose that the user program is using, at most, m 762 00:38:55,930 --> 00:38:58,020 memory at any point during the execution. 763 00:38:58,020 --> 00:38:59,770 Maybe allocating, maybe freeing. 764 00:38:59,770 --> 00:39:02,330 But if I took the high watermark of everything he 765 00:39:02,330 --> 00:39:06,810 says that he needs, we call that m. 766 00:39:06,810 --> 00:39:11,440 Then it turns out that if you use a binned free list allocator, 767 00:39:11,440 --> 00:39:14,330 the amount of virtual memory consumed is at 768 00:39:14,330 --> 00:39:17,480 most m times log m. 769 00:39:17,480 --> 00:39:20,170 Order m times log m. 770 00:39:20,170 --> 00:39:21,160 And why is that? 771 00:39:21,160 --> 00:39:24,000 Here's kind of a hand-wavy proof sketch. 772 00:39:24,000 --> 00:39:28,370 So whenever I make a request for a block of size x, it may 773 00:39:28,370 --> 00:39:32,970 consume at most twice x storage 774 00:39:32,970 --> 00:39:34,710 that's actually in use. 775 00:39:34,710 --> 00:39:38,840 And so it turns out that the total amount of memory devoted 776 00:39:38,840 --> 00:39:43,540 to blocks of size 2 to the k is therefore at most 2m. 777 00:39:43,540 --> 00:39:48,430 So for any given size, I might, at any given point in 778 00:39:48,430 --> 00:39:53,140 time, have the program using all blocks of that given size. 779 00:39:53,140 --> 00:39:57,200 Let's say size 2 to the k. 780 00:39:57,200 --> 00:40:01,350 And since there are a logarithmic number of those 781 00:40:01,350 --> 00:40:09,690 bins, each bin of which could have a size at most m, the 782 00:40:09,690 --> 00:40:11,320 total is order m log m. 783 00:40:11,320 --> 00:40:13,980 784 00:40:13,980 --> 00:40:16,990 Now it turns out that there's actually a stronger property. 785 00:40:16,990 --> 00:40:21,550 In fact, you can show that binned free list is order one 786 00:40:21,550 --> 00:40:22,230 competitive. 
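Written out, the hand-wavy sketch is just this counting argument, using m for the program's high-water mark:

```latex
% A request for x bytes is served with a block of size
% 2^{\lceil \lg x \rceil} < 2x, so the blocks of any one size 2^k
% that are in use at once total at most 2m bytes:
\[
  \mathrm{VM}_k \;\le\; 2m
  \qquad \text{for each bin } k = 0, 1, \ldots, \lceil \lg m \rceil .
\]
% Summing over the logarithmically many bins gives the bound:
\[
  \mathrm{VM} \;\le\; \sum_{k=0}^{\lceil \lg m \rceil} 2m \;=\; O(m \lg m).
\]
```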
787 00:40:22,230 --> 00:40:24,620 Who knows the term "competitive" in this sense? 788 00:40:24,620 --> 00:40:27,530 The algorithmic meaning of "competitive"? 789 00:40:27,530 --> 00:40:32,180 So what it means is, it's at most a constant factor slower 790 00:40:32,180 --> 00:40:35,850 than the optimal omniscient algorithm that could figure 791 00:40:35,850 --> 00:40:39,420 out exactly what's going on in every step. 792 00:40:39,420 --> 00:40:42,280 So you're within a constant factor competitive with the 793 00:40:42,280 --> 00:40:45,970 optimal allocator, assuming that the optimal allocator 794 00:40:45,970 --> 00:40:49,470 does not do what's called coalescing. 795 00:40:49,470 --> 00:40:52,170 If it simply is dividing up things, then it can subdivide, 796 00:40:52,170 --> 00:40:55,390 but it can never glue things back together again. 797 00:40:55,390 --> 00:40:58,150 It turns out that in fact, binned free list is really 798 00:40:58,150 --> 00:41:01,430 much better than that. 799 00:41:01,430 --> 00:41:04,830 So let's talk a bit about coalescing, because many, many 800 00:41:04,830 --> 00:41:06,375 memory allocators use coalescing. 801 00:41:06,375 --> 00:41:09,220 802 00:41:09,220 --> 00:41:14,250 So the idea is, I'm concerned about the case where I end up 803 00:41:14,250 --> 00:41:18,380 with a whole bunch of little small objects, and yet the 804 00:41:18,380 --> 00:41:21,110 allocations now come from big chunks. 805 00:41:21,110 --> 00:41:25,340 And so I have to go and use more of my virtual memory, or 806 00:41:25,340 --> 00:41:27,970 maybe I can glue together those little pieces into a big 807 00:41:27,970 --> 00:41:30,100 chunk, and not have to go off to memory. 808 00:41:30,100 --> 00:41:32,670 So that's what coalescing is all about. 809 00:41:32,670 --> 00:41:35,610 So the idea is, if two things are adjacent, let 810 00:41:35,610 --> 00:41:37,490 me glue them together. 
811 00:41:37,490 --> 00:41:40,060 And there are really clever systems for doing that, 812 00:41:40,060 --> 00:41:44,450 including a very famous one called the buddy system. 813 00:41:44,450 --> 00:41:47,820 But using any of these mechanisms for coalescing, you 814 00:41:47,820 --> 00:41:50,730 always have to trade off against the simplicity of not 815 00:41:50,730 --> 00:41:52,690 coalescing. 816 00:41:52,690 --> 00:41:55,710 Not coalescing is: stick it in the free list, I'm done. 817 00:41:55,710 --> 00:41:57,950 Coalescing is, stick it in the free list, oh, am I next to 818 00:41:57,950 --> 00:42:00,130 somebody I can merge with? 819 00:42:00,130 --> 00:42:01,400 Now let me merge. 820 00:42:01,400 --> 00:42:05,460 So you always have to weigh, is that really going to be an 821 00:42:05,460 --> 00:42:07,290 advantageous thing? 822 00:42:07,290 --> 00:42:10,730 It turns out there are no good theoretical bounds that I'm 823 00:42:10,730 --> 00:42:15,060 aware of that prove the effectiveness of coalescing. 824 00:42:15,060 --> 00:42:17,740 People have not figured out how to analyze this, or what 825 00:42:17,740 --> 00:42:20,620 an appropriate model is for understanding whether 826 00:42:20,620 --> 00:42:22,940 coalescing is a good idea. 827 00:42:22,940 --> 00:42:26,090 However, it does seem to work in practice for some things, 828 00:42:26,090 --> 00:42:29,510 and I think the reason for that-- and this is just my own 829 00:42:29,510 --> 00:42:30,680 take on it-- 830 00:42:30,680 --> 00:42:34,090 is that storage tends to be deallocated as a stack. 831 00:42:34,090 --> 00:42:37,370 So you tend to go into a region of a program where you 832 00:42:37,370 --> 00:42:39,200 do a lot of heap allocation-- 833 00:42:39,200 --> 00:42:41,690 let's say I'm building up a graph, and I keep doing a 834 00:42:41,690 --> 00:42:42,570 whole bunch of mallocs. 835 00:42:42,570 --> 00:42:44,840 And when I'm done with the graph, what do I do? 836 00:42:44,840 --> 00:42:47,320 I free it all. 
837 00:42:47,320 --> 00:42:52,250 So I tend to have these batch allocations and then 838 00:42:52,250 --> 00:42:58,130 deallocations, and they tend to work often as a stack, 839 00:42:58,130 --> 00:42:59,120 often in a batch. 840 00:42:59,120 --> 00:43:02,280 And that means that when I do that freeing, I can sometimes 841 00:43:02,280 --> 00:43:05,550 take advantage of that and do coalescing. 842 00:43:05,550 --> 00:43:14,130 In some systems, people have what are called memory pools, 843 00:43:14,130 --> 00:43:17,440 where you have a region where when you allocate out of it, 844 00:43:17,440 --> 00:43:20,640 you know you're allocating out of your own particular region 845 00:43:20,640 --> 00:43:24,220 specifically so that you can do coalescing of things in 846 00:43:24,220 --> 00:43:26,100 that region. 847 00:43:26,100 --> 00:43:28,760 And so you may have multiple allocators, and you'll call 848 00:43:28,760 --> 00:43:31,930 the allocator on the region, say, for a graph, because 849 00:43:31,930 --> 00:43:34,910 let's imagine you're creating two graphs, and one of them 850 00:43:34,910 --> 00:43:36,140 you're going to keep around, and the other 851 00:43:36,140 --> 00:43:37,060 you're going to dump. 852 00:43:37,060 --> 00:43:39,340 Well, if I create them at the same time, the storage gets 853 00:43:39,340 --> 00:43:43,120 all intermixed. Then when I deallocate one graph, 854 00:43:43,120 --> 00:43:46,020 for the other one I'll have fragmentation all over. 855 00:43:46,020 --> 00:43:48,060 These little spots appear where the other 856 00:43:48,060 --> 00:43:49,380 graph used to be-- 857 00:43:49,380 --> 00:43:50,380 not so good. 
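A minimal sketch of such a memory pool in C. All names and the bump-pointer design are my own; real pool libraries are more elaborate, but this shows the point: each graph gets its own region, and freeing the whole graph is one operation that leaves contiguous storage behind:

```c
#include <assert.h>
#include <stddef.h>

/* A region ("memory pool"): allocate by bumping a pointer inside one
   dedicated buffer; free everything in the region at once. */
typedef struct {
    char  *base;
    size_t used, size;
} pool_t;

static void pool_init(pool_t *p, void *buf, size_t size) {
    p->base = buf;
    p->used = 0;
    p->size = size;
}

/* Bump-pointer allocation, with sizes rounded to 8-byte multiples. */
static void *pool_alloc(pool_t *p, size_t x) {
    x = (x + 7) & ~(size_t)7;
    if (p->used + x > p->size)
        return NULL;
    void *r = p->base + p->used;
    p->used += x;
    return r;
}

static void pool_free_all(pool_t *p) {      /* dump the whole region */
    p->used = 0;
}

static char buf_a[256], buf_b[256];         /* one region per "graph" */

/* Demo: two graphs in two pools; dumping graph B leaves graph A's
   storage contiguous, with no little holes sprinkled through it. */
static int demo(void) {
    pool_t a, b;
    pool_init(&a, buf_a, sizeof buf_a);
    pool_init(&b, buf_b, sizeof buf_b);
    void *n1 = pool_alloc(&a, 24);          /* nodes of graph A */
    void *n2 = pool_alloc(&a, 24);
    void *m1 = pool_alloc(&b, 24);          /* nodes of graph B */
    int ok = n1 == buf_a && n2 == buf_a + 24 && m1 == buf_b;
    pool_free_all(&b);                      /* dump graph B in one shot */
    return ok && b.used == 0 && a.used == 48;
}
```

The design choice here is exactly the batch-deallocation pattern described above: when lifetimes group naturally, a pool turns many frees into one and sidesteps fragmentation entirely.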
858 00:43:50,380 --> 00:43:53,990 But if I say, use this allocator for this graph, use 859 00:43:53,990 --> 00:43:59,310 that allocator for that graph, then I'm 860 00:43:59,310 --> 00:44:01,790 allocating those out of two different regions of memory, 861 00:44:01,790 --> 00:44:06,000 so that when I'm done with the first graph and deallocate 862 00:44:06,000 --> 00:44:08,260 everything, now I've created a nice big piece 863 00:44:08,260 --> 00:44:10,650 of contiguous storage. 864 00:44:10,650 --> 00:44:16,610 So people will sometimes play with different memory pools 865 00:44:16,610 --> 00:44:23,550 for allocators, and C++ has a fairly rich set of libraries 866 00:44:23,550 --> 00:44:26,220 and so forth for that, to allow you to do 867 00:44:26,220 --> 00:44:27,470 that kind of thing. 868 00:44:27,470 --> 00:44:30,170 869 00:44:30,170 --> 00:44:35,800 So let's go on and talk about garbage collection, because 870 00:44:35,800 --> 00:44:38,040 that's something you've probably seen in Java and 871 00:44:38,040 --> 00:44:40,940 Python, and what's going on under the covers there. 872 00:44:40,940 --> 00:44:45,210 Because even though C and C++ do not have garbage 873 00:44:45,210 --> 00:44:46,730 collection, you can actually implement 874 00:44:46,730 --> 00:44:47,860 your own garbage collector. 875 00:44:47,860 --> 00:44:52,620 It's not that hard, except for some details. 876 00:44:52,620 --> 00:44:55,260 877 00:44:55,260 --> 00:44:59,840 It's only hard when you want it to perform well. 878 00:44:59,840 --> 00:45:02,180 So the idea is to free the programmer 879 00:45:02,180 --> 00:45:04,310 from freeing objects. 880 00:45:04,310 --> 00:45:06,490 The downside, by the way, is when you free the programmer 881 00:45:06,490 --> 00:45:08,200 from freeing objects, they tend to 882 00:45:08,200 --> 00:45:10,570 create lots of garbage.
883 00:45:10,570 --> 00:45:13,460 And so what happens in Java programs, for example, is 884 00:45:13,460 --> 00:45:15,120 you'll have these-- 885 00:45:15,120 --> 00:45:19,460 I remember once having a Java program where, when it was 886 00:45:19,460 --> 00:45:23,640 sitting, waiting for user input, it would periodically 887 00:45:23,640 --> 00:45:25,750 go, garbage collect. 888 00:45:25,750 --> 00:45:27,576 It was just sitting there, and then it 889 00:45:27,576 --> 00:45:29,050 would go, garbage collect. 890 00:45:29,050 --> 00:45:30,010 We weren't doing anything. 891 00:45:30,010 --> 00:45:33,460 It was just periodically, the way they had written the code, 892 00:45:33,460 --> 00:45:36,940 they were somehow generating garbage, doing allocations and 893 00:45:36,940 --> 00:45:40,540 frees, when you were sitting around doing nothing. 894 00:45:40,540 --> 00:45:43,510 And so it would basically use up all of the space and then 895 00:45:43,510 --> 00:45:45,530 do a garbage collection, use up all the space and do a 896 00:45:45,530 --> 00:45:46,260 garbage collection. 897 00:45:46,260 --> 00:45:48,840 Naturally that's kind of a slow system. 898 00:45:48,840 --> 00:45:51,520 And so even in something like Java, where you have a garbage 899 00:45:51,520 --> 00:45:53,770 collector, you do have to worry about whether you're 900 00:45:53,770 --> 00:45:57,200 creating garbage or not, and you can generate faster code 901 00:45:57,200 --> 00:46:00,820 by not creating as much garbage, or by finding ways of 902 00:46:00,820 --> 00:46:03,120 recycling the garbage yourself. 903 00:46:03,120 --> 00:46:06,580 So even in Java, it's worthwhile saying, give me a 904 00:46:06,580 --> 00:46:08,890 big chunk of memory, let me manage it myself, 905 00:46:08,890 --> 00:46:09,960 now let me free it. 906 00:46:09,960 --> 00:46:12,220 I don't have to free it, but when I'm done with it, I'll be 907 00:46:12,220 --> 00:46:12,670 done with it. 
908 00:46:12,670 --> 00:46:15,170 But in the meantime, let me manage it myself so that I 909 00:46:15,170 --> 00:46:20,790 don't have it continually generating more garbage. 910 00:46:20,790 --> 00:46:23,790 So the garbage collector can be built in, or 911 00:46:23,790 --> 00:46:25,350 you can do it yourself. 912 00:46:25,350 --> 00:46:29,020 And mostly in this class, we do stuff ourselves, so we know 913 00:46:29,020 --> 00:46:30,290 how they work. 914 00:46:30,290 --> 00:46:32,840 So that even when you're doing it with the built-in one, you know 915 00:46:32,840 --> 00:46:36,470 what's going on under the covers. 916 00:46:36,470 --> 00:46:38,450 So here's some terminology. 917 00:46:38,450 --> 00:46:41,660 The roots in garbage collection are the objects 918 00:46:41,660 --> 00:46:43,930 that are directly accessible by the program. 919 00:46:43,930 --> 00:46:46,770 And those typically are things like globals, stack 920 00:46:46,770 --> 00:46:48,570 variables, and so forth. 921 00:46:48,570 --> 00:46:53,830 Things that the program can directly address by name. 922 00:46:53,830 --> 00:46:57,100 The live objects are reachable from the roots by following 923 00:46:57,100 --> 00:47:00,890 pointers, and the dead objects are the ones that the 924 00:47:00,890 --> 00:47:04,790 programmer can never get to again. 925 00:47:04,790 --> 00:47:08,180 And so what we're interested in doing in garbage collection 926 00:47:08,180 --> 00:47:12,610 is finding all the dead objects and recycling them so 927 00:47:12,610 --> 00:47:15,060 we can use them again, so that they don't end up taking up 928 00:47:15,060 --> 00:47:19,590 space in our virtual memory and slowing us down by making 929 00:47:19,590 --> 00:47:22,140 our virtual memory space continue to grow and grow and 930 00:47:22,140 --> 00:47:27,050 grow across many disk pages.
931 00:47:27,050 --> 00:47:33,220 So one of the questions, in order for one to be able to do 932 00:47:33,220 --> 00:47:36,560 garbage collection, is how do I know where 933 00:47:36,560 --> 00:47:39,080 pointers are in objects? 934 00:47:39,080 --> 00:47:41,200 How do I know what's accessible? 935 00:47:41,200 --> 00:47:44,590 And so typically, in order to do that, you have to have a 936 00:47:44,590 --> 00:47:48,730 very strong idea of where is a pointer and where is data? 937 00:47:48,730 --> 00:47:50,630 Because you don't want to follow data as if it's a 938 00:47:50,630 --> 00:47:54,220 pointer, you want to know which regions of a given block 939 00:47:54,220 --> 00:47:56,090 of memory are pointers. 940 00:47:56,090 --> 00:47:59,370 And so strong typing helps a lot with that, which is why 941 00:47:59,370 --> 00:48:02,540 mostly people do garbage collection in strongly typed 942 00:48:02,540 --> 00:48:05,590 languages, because then you can actually identify it. 943 00:48:05,590 --> 00:48:10,040 It also requires you to prohibit pointer arithmetic. 944 00:48:10,040 --> 00:48:12,280 So the ability to take a pointer and then index off 945 00:48:12,280 --> 00:48:16,000 that pointer means I can get to some piece of storage that 946 00:48:16,000 --> 00:48:18,140 I can't access directly. 947 00:48:18,140 --> 00:48:20,640 And so typically in these kinds of systems, pointer 948 00:48:20,640 --> 00:48:23,940 arithmetic is illegal. 949 00:48:23,940 --> 00:48:28,000 So you can do array indexing, as long as you treat the array 950 00:48:28,000 --> 00:48:29,310 as a single object. 951 00:48:29,310 --> 00:48:33,450 You can't treat an array as the component pieces. 952 00:48:33,450 --> 00:48:36,070 You have to treat it as a single object. 953 00:48:36,070 --> 00:48:37,310 So pointer arithmetic. 954 00:48:37,310 --> 00:48:39,910 Well, you folks, who's familiar with pointer 955 00:48:39,910 --> 00:48:41,420 arithmetic? 
956 00:48:41,420 --> 00:48:43,320 At this point, almost all of you have done stuff with 957 00:48:43,320 --> 00:48:47,770 pointer arithmetic, if your code is any good. 958 00:48:47,770 --> 00:48:50,940 Almost all of you have done adding to pointers in order to 959 00:48:50,940 --> 00:48:54,020 move through an array or what have you as being somewhat 960 00:48:54,020 --> 00:48:57,370 cheaper than doing an array index every time. 961 00:48:57,370 --> 00:49:02,460 It saves you an extra memory reference, typically. 962 00:49:02,460 --> 00:49:05,300 So that's one of their restrictions when you want to 963 00:49:05,300 --> 00:49:08,530 do garbage collection in a general setting. 964 00:49:08,530 --> 00:49:11,380 Now probably the simplest and most useful one as a 965 00:49:11,380 --> 00:49:15,150 programmer to use for garbage collection is what's called 966 00:49:15,150 --> 00:49:16,680 reference counting. 967 00:49:16,680 --> 00:49:19,490 I know many, many programs where what you do is you do 968 00:49:19,490 --> 00:49:23,130 reference counting because you can. 969 00:49:23,130 --> 00:49:25,660 But as you'll see, it has a limitation. 970 00:49:25,660 --> 00:49:28,950 So the idea in reference counting is to keep a count of 971 00:49:28,950 --> 00:49:31,890 the number of pointers referencing each object. 972 00:49:31,890 --> 00:49:33,780 So here we have a whole bunch of different objects of 973 00:49:33,780 --> 00:49:36,630 different sizes, and whenever there's a pointer going to the 974 00:49:36,630 --> 00:49:39,353 object, I add one to the reference count shown in green 975 00:49:39,353 --> 00:49:41,070 at the top here. 976 00:49:41,070 --> 00:49:44,760 I hope I got the count right. 977 00:49:44,760 --> 00:49:50,820 Whenever the count drops to zero, you free the object. 978 00:49:50,820 --> 00:49:52,600 That means nobody's pointing to this object 979 00:49:52,600 --> 00:49:52,930 [UNINTELLIGIBLE]. 980 00:49:52,930 --> 00:49:53,930 Let's free it. 
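As a concrete illustration of the scheme just described, here is what reference counting can look like in C. This is only a hypothetical sketch (the obj_t type, the retain/release names, and the live_objects counter are invented for illustration, not the lecture's code): the count lives in each object, and when it drops to zero, the object is freed and the counts of the objects it points to are decremented in turn.

```c
#include <stdlib.h>

#define MAX_KIDS 4

int live_objects = 0;            /* for checking: how many objects exist */

/* Each object keeps a count of the pointers referencing it. */
typedef struct obj {
    int ref;                     /* number of incoming pointers */
    struct obj *kid[MAX_KIDS];   /* outgoing pointers (NULL if unused) */
} obj_t;

obj_t *obj_new(void) {           /* new object with one reference: the caller's */
    obj_t *o = calloc(1, sizeof(obj_t));
    o->ref = 1;
    live_objects++;
    return o;
}

void retain(obj_t *o) { if (o) o->ref++; }

/* Decrement the count; on zero, free the object and release
 * everything it points to, cascading as far as necessary. */
void release(obj_t *o) {
    if (o && --o->ref == 0) {
        for (int i = 0; i < MAX_KIDS; i++)
            release(o->kid[i]);
        live_objects--;
        free(o);
    }
}

/* Overwrite the pointer parent->kid[i], maintaining the counts. */
void set_kid(obj_t *parent, int i, obj_t *child) {
    retain(child);               /* increment first, in case child == old kid */
    release(parent->kid[i]);
    parent->kid[i] = child;
}
```

Moving a pointer with set_kid does exactly what the slide describes: bump the new target's count, drop the old target's count, and reclaim anything that reaches zero.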
981 00:49:53,930 --> 00:49:56,470 So let's see how that goes. 982 00:49:56,470 --> 00:49:58,840 So suppose that, for example, that pointer got 983 00:49:58,840 --> 00:50:00,920 moved to over here. 984 00:50:00,920 --> 00:50:02,750 Went away from there, got moved to there. 985 00:50:02,750 --> 00:50:05,800 Well, now there are two fields that have to be updated. 986 00:50:05,800 --> 00:50:08,140 The one that we took the pointer away from and the one 987 00:50:08,140 --> 00:50:09,700 that we put it to. 988 00:50:09,700 --> 00:50:11,850 So we update those. 989 00:50:11,850 --> 00:50:13,410 Oops, that guy went to zero. 990 00:50:13,410 --> 00:50:16,210 That means this stuff is garbage. 991 00:50:16,210 --> 00:50:18,470 But I can't just immediately go and collect 992 00:50:18,470 --> 00:50:19,490 that piece of storage. 993 00:50:19,490 --> 00:50:20,740 Why not? 994 00:50:20,740 --> 00:50:22,956 995 00:50:22,956 --> 00:50:25,932 AUDIENCE: To save the value of the pointer, and [INAUDIBLE] 996 00:50:25,932 --> 00:50:26,924 pointer later? 997 00:50:26,924 --> 00:50:27,970 CHARLES LEISERSON: Yeah, to save the 998 00:50:27,970 --> 00:50:29,230 value of the pointer-- 999 00:50:29,230 --> 00:50:31,015 AUDIENCE: To save the value of the pointer and then use it 1000 00:50:31,015 --> 00:50:32,152 again later. 1001 00:50:32,152 --> 00:50:35,074 The pointer isn't there, but you know where-- 1002 00:50:35,074 --> 00:50:37,022 CHARLES LEISERSON: Not quite, not quite. 1003 00:50:37,022 --> 00:50:37,509 Yes? 1004 00:50:37,509 --> 00:50:38,970 AUDIENCE: [INAUDIBLE] 1005 00:50:38,970 --> 00:50:41,405 the reference counts for [INAUDIBLE], and also-- 1006 00:50:41,405 --> 00:50:41,892 CHARLES LEISERSON: Yeah. 1007 00:50:41,892 --> 00:50:45,490 If you're going to deallocate this guy, he's got pointers. 1008 00:50:45,490 --> 00:50:48,440 They have to follow his pointers and see which ones 1009 00:50:48,440 --> 00:50:50,690 need to be deallocated. 1010 00:50:50,690 --> 00:50:52,080 Good. 
1011 00:50:52,080 --> 00:50:54,610 So this guy now, we realize, is garbage. 1012 00:50:54,610 --> 00:50:59,000 But before we can just free him, I've got to deallocate 1013 00:50:59,000 --> 00:51:01,410 these pointers. 1014 00:51:01,410 --> 00:51:03,510 So this guy also got decremented, but he didn't go to zero. 1015 00:51:03,510 --> 00:51:08,150 So this guy, we have to decrement these two fields, 1016 00:51:08,150 --> 00:51:10,150 because we took away those pointers because we're going 1017 00:51:10,150 --> 00:51:12,050 to garbage collect that. 1018 00:51:12,050 --> 00:51:14,380 And when we do that, it turns out this thing turns to 1019 00:51:14,380 --> 00:51:17,310 garbage, as well. 1020 00:51:17,310 --> 00:51:19,960 So when you do reference counting, you have to make 1021 00:51:19,960 --> 00:51:23,672 sure that when you decrement things and decide you're going 1022 00:51:23,672 --> 00:51:26,740 to free something, you also then go through and free the 1023 00:51:26,740 --> 00:51:29,380 things, and you continue that process until everything that 1024 00:51:29,380 --> 00:51:35,880 can be deallocated has been deallocated. 1025 00:51:35,880 --> 00:51:39,340 And then this becomes recyclable, put back into the 1026 00:51:39,340 --> 00:51:43,310 storage, into your free list or whatever scheme you have. 1027 00:51:43,310 --> 00:51:48,130 So this is a very simple way, when you're implementing data 1028 00:51:48,130 --> 00:51:50,070 structures, of building your own dynamic storage. 1029 00:51:50,070 --> 00:51:52,930 Because you basically can allocate stuff, you can free 1030 00:51:52,930 --> 00:51:58,230 it, and whenever the reference count goes to 0, boom. 1031 00:51:58,230 --> 00:52:01,890 You just have it deallocated. 1032 00:52:01,890 --> 00:52:02,480 OK? 1033 00:52:02,480 --> 00:52:05,590 But you can program that yourself [INAUDIBLE]. 1034 00:52:05,590 --> 00:52:07,170 Now there's a problem with the scheme, though. 1035 00:52:07,170 --> 00:52:09,710 What's the problem?
1036 00:52:09,710 --> 00:52:11,170 Cycles. 1037 00:52:11,170 --> 00:52:14,090 Yeah, cycles are the problem for reference counting. 1038 00:52:14,090 --> 00:52:16,040 So here's an example. 1039 00:52:16,040 --> 00:52:18,890 The problem is that a cycle is never garbage collected. 1040 00:52:18,890 --> 00:52:22,030 Let's take a look at this structure here. 1041 00:52:22,030 --> 00:52:25,640 So everybody's got a pointer from somewhere, but it turns 1042 00:52:25,640 --> 00:52:33,950 out that there's a cycle here where, notice that they are 1043 00:52:33,950 --> 00:52:36,040 pointing to each other, we even have a guy coming off 1044 00:52:36,040 --> 00:52:39,870 here, but nothing points into the cycle. 1045 00:52:39,870 --> 00:52:42,760 So there's no way that, from the roots, I can ever access 1046 00:52:42,760 --> 00:52:44,980 these four guys. 1047 00:52:44,980 --> 00:52:47,790 They're garbage, but my reference counting scheme will 1048 00:52:47,790 --> 00:52:55,220 never find it, because no one's ever going to decrement 1049 00:52:55,220 --> 00:52:57,810 one of those counters here. 1050 00:52:57,810 --> 00:53:01,060 So those are all garbage, but how would I know? 1051 00:53:01,060 --> 00:53:03,550 And that's basically the disadvantage 1052 00:53:03,550 --> 00:53:08,420 of reference counting. 1053 00:53:08,420 --> 00:53:13,860 And uncollected garbage, as we all know, stinks. 1054 00:53:13,860 --> 00:53:17,630 So we don't want that situation to occur. 1055 00:53:17,630 --> 00:53:21,860 So nevertheless, reference counting is great if you've 1056 00:53:21,860 --> 00:53:24,650 got an acyclic structure. 1057 00:53:24,650 --> 00:53:28,070 And the number of times acyclic structures come up 1058 00:53:28,070 --> 00:53:30,830 where reference counting is the simple and easy way to do 1059 00:53:30,830 --> 00:53:35,540 the storage management is huge when you get into doing any 1060 00:53:35,540 --> 00:53:37,530 kind of interesting programming. 
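To see the cycle leak concretely, here is a toy version of the same counting scheme (the node_t type, release, and the live counter are invented for illustration) with two objects pointing at each other. Once the last outside pointers are dropped, each count sticks at one, so neither object is ever freed.

```c
#include <stdlib.h>

int live = 0;                    /* how many nodes are currently allocated */

typedef struct node {
    int ref;                     /* number of incoming pointers */
    struct node *next;
} node_t;

node_t *node_new(void) {         /* one reference to start: the caller's */
    node_t *n = calloc(1, sizeof(node_t));
    n->ref = 1;
    live++;
    return n;
}

void release(node_t *n) {
    if (n && --n->ref == 0) {
        node_t *next = n->next;
        live--;
        free(n);
        release(next);           /* cascade to whatever it pointed at */
    }
}
```

Link a->next = b (bumping b's count) and b->next = a (bumping a's count), then release both outside references: both counts drop from 2 to 1, never reach zero, and the pair leaks even though nothing in the program can reach it anymore.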
1061 00:53:37,530 --> 00:53:39,220 Lots and lots of opportunity to use 1062 00:53:39,220 --> 00:53:41,840 the reference counting approach. 1063 00:53:41,840 --> 00:53:43,420 So questions about reference counting? 1064 00:53:43,420 --> 00:53:47,230 1065 00:53:47,230 --> 00:53:48,130 You have a question? 1066 00:53:48,130 --> 00:53:48,809 Yeah? 1067 00:53:48,809 --> 00:53:50,246 AUDIENCE: So this means you actually have to do this 1068 00:53:50,246 --> 00:53:51,683 [UNINTELLIGIBLE] 1069 00:53:51,683 --> 00:53:53,120 ever do a [INAUDIBLE]? 1070 00:53:53,120 --> 00:53:55,040 You can, like, save the work-- 1071 00:53:55,040 --> 00:53:55,650 CHARLES LEISERSON: That's right. 1072 00:53:55,650 --> 00:53:58,380 Whenever you move a pointer, if you're going to move a 1073 00:53:58,380 --> 00:54:02,180 pointer, you first decrement the counter on the place 1074 00:54:02,180 --> 00:54:04,820 you're moving it from, and then garbage collect if need 1075 00:54:04,820 --> 00:54:07,740 be, and then you move the pointer-- 1076 00:54:07,740 --> 00:54:09,930 that's typically what you do-- you move the pointer and 1077 00:54:09,930 --> 00:54:16,390 increment the count at that location, which typically 1078 00:54:16,390 --> 00:54:19,790 doesn't require any extra work to decrement, which is the 1079 00:54:19,790 --> 00:54:20,680 tricky thing. 1080 00:54:20,680 --> 00:54:20,960 Sorry? 1081 00:54:20,960 --> 00:54:22,722 AUDIENCE: There's no good way to do this in idle time. 1082 00:54:22,722 --> 00:54:26,180 Like I'll have some idle time later, let me do it-- 1083 00:54:26,180 --> 00:54:28,520 CHARLES LEISERSON: So that's what you use true garbage 1084 00:54:28,520 --> 00:54:30,520 collection for. Yeah? 1085 00:54:30,520 --> 00:54:33,790 1086 00:54:33,790 --> 00:54:35,196 AUDIENCE: You could actually, there's techniques where you 1087 00:54:35,196 --> 00:54:38,680 can defer the work of incrementing it.
1088 00:54:38,680 --> 00:54:40,390 Like you basically put the increment and decrement 1089 00:54:40,390 --> 00:54:43,040 operations that you'd like to do into a buffer, and then you 1090 00:54:43,040 --> 00:54:44,100 can do them later. 1091 00:54:44,100 --> 00:54:46,230 But it's sort of the same. 1092 00:54:46,230 --> 00:54:47,610 CHARLES LEISERSON: Yeah, you could log. 1093 00:54:47,610 --> 00:54:50,350 So there are schemes where you log that, oh, I decremented 1094 00:54:50,350 --> 00:54:52,590 this, better check it later, type thing. 1095 00:54:52,590 --> 00:54:54,260 Or I decrement it to 0. 1096 00:54:54,260 --> 00:54:57,550 Let me just put it in there, and I'll go actually do the 1097 00:54:57,550 --> 00:55:00,460 stuff in batch mode at some later time. 1098 00:55:00,460 --> 00:55:01,150 Yep. 1099 00:55:01,150 --> 00:55:05,080 So programming is fun, because you can be clever with 1100 00:55:05,080 --> 00:55:07,020 schemes like that. 1101 00:55:07,020 --> 00:55:09,430 And that could be quite a reasonable thing to do, which 1102 00:55:09,430 --> 00:55:13,370 is wait until you're waiting on the user for input, and at 1103 00:55:13,370 --> 00:55:14,730 that point, collect your garbage. 1104 00:55:14,730 --> 00:55:17,970 Now, the thing you have to risk is, what happens if that 1105 00:55:17,970 --> 00:55:20,690 takes a long time to get there? 1106 00:55:20,690 --> 00:55:27,350 And then there's the complexity of your algorithms 1107 00:55:27,350 --> 00:55:28,060 and so forth. 1108 00:55:28,060 --> 00:55:31,260 So you get into a lot of issues of software maintenance 1109 00:55:31,260 --> 00:55:32,460 and so forth. 
1110 00:55:32,460 --> 00:55:34,990 But that can be a very effective thing, is take it 1111 00:55:34,990 --> 00:55:37,220 out of the critical path of the actual computation you're 1112 00:55:37,220 --> 00:55:40,390 doing, if that's mission critical, and put it in some 1113 00:55:40,390 --> 00:55:42,490 part of the computation where it isn't mission critical. 1114 00:55:42,490 --> 00:55:43,740 Very good idea. 1115 00:55:43,740 --> 00:55:46,875 1116 00:55:46,875 --> 00:55:47,730 I'm sorry? 1117 00:55:47,730 --> 00:55:48,980 AUDIENCE: For thread safety. 1118 00:55:48,980 --> 00:55:51,890 1119 00:55:51,890 --> 00:55:56,120 Like if you have shared reference counts, like 1120 00:55:56,120 --> 00:55:58,600 multiple threads updating the counts, you don't want to have 1121 00:55:58,600 --> 00:56:01,370 them all updating the same count at the same time. 1122 00:56:01,370 --> 00:56:04,160 So they fill a thread-local buffer, and-- 1123 00:56:04,160 --> 00:56:04,670 CHARLES LEISERSON: I see what you're saying. 1124 00:56:04,670 --> 00:56:05,230 Yes. 1125 00:56:05,230 --> 00:56:06,970 So we're going to get into that as we deal with 1126 00:56:06,970 --> 00:56:10,780 multi-threading and so forth, that if you've got two guys 1127 00:56:10,780 --> 00:56:12,490 that are operating on the same data structure, you have to be 1128 00:56:12,490 --> 00:56:15,180 very careful about the consistency, and that one 1129 00:56:15,180 --> 00:56:17,500 doesn't do something while the other is doing something else, 1130 00:56:17,500 --> 00:56:18,310 and so forth. 1131 00:56:18,310 --> 00:56:20,390 So for that reason, it can be a good 1132 00:56:20,390 --> 00:56:21,930 idea to postpone the operation. 1133 00:56:21,930 --> 00:56:25,090 1134 00:56:25,090 --> 00:56:29,150 So general garbage collection is based on a graph 1135 00:56:29,150 --> 00:56:29,800 abstraction. 1136 00:56:29,800 --> 00:56:33,080 So let's move into the graph world.
1137 00:56:33,080 --> 00:56:37,040 So I can view objects and pointers as forming a directed 1138 00:56:37,040 --> 00:56:40,480 graph G = (V, E), where V are the objects and E are the 1139 00:56:40,480 --> 00:56:42,650 pointers from one object to another. 1140 00:56:42,650 --> 00:56:45,310 1141 00:56:45,310 --> 00:56:48,560 The live objects are reachable from the roots. 1142 00:56:48,560 --> 00:56:53,080 And now the thing is that finding all of the live 1143 00:56:53,080 --> 00:56:57,270 objects is then like doing a graph search. 1144 00:56:57,270 --> 00:57:00,770 The common ways of doing a graph search are depth-first 1145 00:57:00,770 --> 00:57:03,080 search and breadth-first search. 1146 00:57:03,080 --> 00:57:05,540 It turns out we're going to use breadth-first search for 1147 00:57:05,540 --> 00:57:08,660 reasons that I'll show you. 1148 00:57:08,660 --> 00:57:10,740 So the idea in breadth-first search is to 1149 00:57:10,740 --> 00:57:12,140 find all the objects. 1150 00:57:12,140 --> 00:57:14,740 So the idea is, what we're going to do is go through and 1151 00:57:14,740 --> 00:57:18,520 find, where are all the objects in my system that are 1152 00:57:18,520 --> 00:57:19,760 reachable from the root? 1153 00:57:19,760 --> 00:57:22,590 And then anything that I didn't find, ah. 1154 00:57:22,590 --> 00:57:27,540 That must be garbage. 1155 00:57:27,540 --> 00:57:28,480 So here's the idea. 1156 00:57:28,480 --> 00:57:34,600 What we're going to use is a FIFO queue called Q. So that's 1157 00:57:34,600 --> 00:57:36,420 a first in, first out queue. 1158 00:57:36,420 --> 00:57:39,070 It has a head and a tail. 1159 00:57:39,070 --> 00:57:41,840 And the mechanism is very much like a stack. 1160 00:57:41,840 --> 00:57:45,110 Whenever I need to enqueue something, I put it on the 1161 00:57:45,110 --> 00:57:48,000 tail end and increment the tail pointer.
1162 00:57:48,000 --> 00:57:50,560 Whenever I need to take something off the queue, I 1163 00:57:50,560 --> 00:57:51,810 increment the head pointer. 1164 00:57:51,810 --> 00:57:54,290 1165 00:57:54,290 --> 00:57:59,800 So I'm always doing head and tail, pushing on the tail and 1166 00:57:59,800 --> 00:58:01,380 pulling off the head. 1167 00:58:01,380 --> 00:58:03,010 Enqueue and dequeue. 1168 00:58:03,010 --> 00:58:07,340 So the way that this works is, I basically go through all my 1169 00:58:07,340 --> 00:58:13,000 objects, and if they're a root, I put a mark on it. 1170 00:58:13,000 --> 00:58:15,120 And I enqueue-- 1171 00:58:15,120 --> 00:58:18,320 this is pseudocode, by the way, so don't hold me to my C 1172 00:58:18,320 --> 00:58:20,580 programming standard here. 1173 00:58:20,580 --> 00:58:26,600 And I mark it, and then I enqueue it on the queue Q. And 1174 00:58:26,600 --> 00:58:29,320 otherwise, I just simply mark it as zero. 1175 00:58:29,320 --> 00:58:33,030 So being marked means whether we've discovered it's 1176 00:58:33,030 --> 00:58:34,740 reachable from a root. 1177 00:58:34,740 --> 00:58:37,660 So to begin with, what we're going to do is mark everything 1178 00:58:37,660 --> 00:58:42,000 that's reachable from the roots, and we're going to 1179 00:58:42,000 --> 00:58:44,870 enqueue them on the FIFO. 1180 00:58:44,870 --> 00:58:46,320 And everything else, we're going to mark as 1181 00:58:46,320 --> 00:58:48,550 unreachable, for now. 1182 00:58:48,550 --> 00:58:51,450 And now what we're going to do is see which ones we discover. 1183 00:58:51,450 --> 00:58:56,930 So what we do is, while the queue is non-empty, we're 1184 00:58:56,930 --> 00:59:01,030 going to take something off of the queue and then look at all 1185 00:59:01,030 --> 00:59:02,520 of its neighbors.
1186 00:59:02,520 --> 00:59:06,890 So for all the edges (u, v), look at all the things that go from u 1187 00:59:06,890 --> 00:59:11,440 to v. And if it's not marked, then we're going to mark it 1188 00:59:11,440 --> 00:59:13,340 and put it on the queue. 1189 00:59:13,340 --> 00:59:15,800 That's it. 1190 00:59:15,800 --> 00:59:17,530 If it's already marked, we don't have to do anything with 1191 00:59:17,530 --> 00:59:19,960 it, but if it's unmarked, we'll mark it and enqueue it. 1192 00:59:19,960 --> 00:59:23,910 And then in the end, everything that's marked is 1193 00:59:23,910 --> 00:59:29,040 going to be something that is live, and everything that is 1194 00:59:29,040 --> 00:59:32,110 unmarked is going to be things that are dead. 1195 00:59:32,110 --> 00:59:35,480 But it's going to be actually even sexier and more clever 1196 00:59:35,480 --> 00:59:37,930 than that, as you'll see. 1197 00:59:37,930 --> 00:59:40,520 So here's an example. 1198 00:59:40,520 --> 00:59:43,030 We start out with my queue being empty, with head and 1199 00:59:43,030 --> 00:59:44,570 tail pointing to the same place. 1200 00:59:44,570 --> 00:59:47,000 And what we do, is we go through and we first-- in this 1201 00:59:47,000 --> 00:59:50,360 case, I have just one root R, right here. 1202 00:59:50,360 --> 00:59:53,780 So we basically mark R and put it on, and 1203 00:59:53,780 --> 00:59:54,790 that's how we get started. 1204 00:59:54,790 --> 00:59:57,640 And everything else is unmarked up here. 1205 00:59:57,640 --> 00:59:59,720 So I'm going to mark with green up here, and I'm going 1206 00:59:59,720 --> 01:00:03,380 to show what's on the queue with the green down here, and 1207 01:00:03,380 --> 01:00:06,510 dark green when I pop something off. 1208 01:00:06,510 --> 01:00:08,920 So the first thing I do is I say, OK. 1209 01:00:08,920 --> 01:00:10,750 Let me dequeue something.
1210 01:00:10,750 --> 01:00:13,900 And in this case, I'm dequeueing R, because it's the 1211 01:00:13,900 --> 01:00:15,180 only thing on there. 1212 01:00:15,180 --> 01:00:17,730 And what I do now is I visit all the neighbors. 1213 01:00:17,730 --> 01:00:22,340 So I visit b, so I put that on, mark it and put that on, 1214 01:00:22,340 --> 01:00:25,200 and then I visit c, so I visit and put that on. 1215 01:00:25,200 --> 01:00:28,330 1216 01:00:28,330 --> 01:00:32,500 Then I'm done with R. I visited all its neighbors. 1217 01:00:32,500 --> 01:00:37,390 So then I go and I dequeue b, and now 1218 01:00:37,390 --> 01:00:38,660 visit all its neighbors. 1219 01:00:38,660 --> 01:00:41,090 Well, in visiting c, there's nothing to be done 1220 01:00:41,090 --> 01:00:43,230 there, because c is already marked. 1221 01:00:43,230 --> 01:00:44,520 So I don't enqueue it again. 1222 01:00:44,520 --> 01:00:47,460 I already have observed that. 1223 01:00:47,460 --> 01:00:55,500 And there's nothing else adjacent to b, so basically, I 1224 01:00:55,500 --> 01:00:59,510 then go on and dequeue c. 1225 01:00:59,510 --> 01:01:02,380 So I have c here, and now we go visit its neighbors, and 1226 01:01:02,380 --> 01:01:04,170 the neighbors are d and e. 1227 01:01:04,170 --> 01:01:11,980 So we enqueue d, enqueue e, and now we are done enqueuing 1228 01:01:11,980 --> 01:01:16,120 all its neighbors and marking them, so now we pop off d. 1229 01:01:16,120 --> 01:01:19,510 And now d has no neighbors, so there's nothing to be done. 1230 01:01:19,510 --> 01:01:23,990 And now I dequeue e, and basically, e has one neighbor, 1231 01:01:23,990 --> 01:01:27,640 f, so we mark f. 1232 01:01:27,640 --> 01:01:34,540 Then we dequeue f, we mark g, then we dequeue g, and g has-- 1233 01:01:34,540 --> 01:01:36,660 the only neighbor is e. 1234 01:01:36,660 --> 01:01:43,330 And now my queue is empty, and so I'm done. 1235 01:01:43,330 --> 01:01:44,690 And now you'll see, what did I do?
1236 01:01:44,690 --> 01:01:48,010 I marked all the things that were reachable from r, and 1237 01:01:48,010 --> 01:01:51,470 everything else is garbage. 1238 01:01:51,470 --> 01:01:54,090 Now, that's only half the story. 1239 01:01:54,090 --> 01:01:57,720 So I've marked them, but what's more relevant, it turns out, for 1240 01:01:57,720 --> 01:02:01,190 garbage collection is to look at what I've got in my queue. 1241 01:02:01,190 --> 01:02:04,320 What do I have in my queue? 1242 01:02:04,320 --> 01:02:11,100 I've got all the live objects; they got put in the queue as we 1243 01:02:11,100 --> 01:02:13,060 were going along. 1244 01:02:13,060 --> 01:02:14,940 All the live objects are put in the queue. 1245 01:02:14,940 --> 01:02:18,040 And not only that, but notice they're all adjacent to each 1246 01:02:18,040 --> 01:02:19,440 other in the queue. 1247 01:02:19,440 --> 01:02:22,222 They're compact in the queue. 1248 01:02:22,222 --> 01:02:25,070 1249 01:02:25,070 --> 01:02:27,050 And so the idea is, we're going to take advantage of 1250 01:02:27,050 --> 01:02:27,920 that property. 1251 01:02:27,920 --> 01:02:30,440 That when I do breadth-first search, when 1252 01:02:30,440 --> 01:02:34,630 I'm done, I've put every live object in the queue. 1253 01:02:34,630 --> 01:02:36,430 So all the live vertices are placed in 1254 01:02:36,430 --> 01:02:37,960 contiguous storage in the queue. 1255 01:02:37,960 --> 01:02:40,580 1256 01:02:40,580 --> 01:02:43,680 So this is the principle behind the copying garbage 1257 01:02:43,680 --> 01:02:47,780 collector, which is one of the more popular garbage 1258 01:02:47,780 --> 01:02:49,540 collectors that's used. 1259 01:02:49,540 --> 01:02:53,860 The idea is that as I'm executing, I'm going to have 1260 01:02:53,860 --> 01:03:01,630 some live objects and some dead objects, and I'm going to 1261 01:03:01,630 --> 01:03:05,830 have some place that I'm doing my next allocation at.
1262 01:03:05,830 --> 01:03:11,410 And as I go through, I allocate more things in what's 1263 01:03:11,410 --> 01:03:15,610 called the FROM space, and I keep allocating, and I may do 1264 01:03:15,610 --> 01:03:19,590 a deallocation, and maybe some more allocation, and maybe 1265 01:03:19,590 --> 01:03:23,030 another deallocation, and some more allocation, and 1266 01:03:23,030 --> 01:03:32,140 eventually I run out of the memory that I've assigned for 1267 01:03:32,140 --> 01:03:34,000 my heap storage. 1268 01:03:34,000 --> 01:03:37,440 So when I get to this point, now what I do is I'm going to 1269 01:03:37,440 --> 01:03:43,120 do breadth-first search on the FROM space. 1270 01:03:43,120 --> 01:03:47,190 And what I'm going to do is I'm going to copy 1271 01:03:47,190 --> 01:03:49,670 the live storage here, using [UNINTELLIGIBLE], to a TO 1272 01:03:49,670 --> 01:03:56,600 space, where the TO space implements the FIFO queue. 1273 01:03:56,600 --> 01:03:59,540 So here's the TO space, and when I go through and I walk 1274 01:03:59,540 --> 01:04:03,250 all that, I've now copied all of the values, all of the 1275 01:04:03,250 --> 01:04:05,690 objects, down to this space over here. 1276 01:04:05,690 --> 01:04:08,390 And now I have this amount of storage left. 1277 01:04:08,390 --> 01:04:10,590 I make the TO space exactly the same 1278 01:04:10,590 --> 01:04:12,130 size as the FROM space. 1279 01:04:12,130 --> 01:04:16,000 I have to make it at least that big, because I know that 1280 01:04:16,000 --> 01:04:19,200 it may be that everything here is alive. 1281 01:04:19,200 --> 01:04:20,660 Maybe nothing got deallocated. 1282 01:04:20,660 --> 01:04:23,490 So I have to make sure that the TO space is large enough 1283 01:04:23,490 --> 01:04:26,340 to handle everything that might be alive there, and it 1284 01:04:26,340 --> 01:04:27,960 may be everything.
1285 01:04:27,960 --> 01:04:30,700 So the TO space is generally allocated to be exactly the 1286 01:04:30,700 --> 01:04:32,025 same size as the FROM space. 1287 01:04:32,025 --> 01:04:34,750 1288 01:04:34,750 --> 01:04:38,140 So basically, the way I compacted is I copied things 1289 01:04:38,140 --> 01:04:40,050 down using the breadth-first search algorithm. 1290 01:04:40,050 --> 01:04:42,700 1291 01:04:42,700 --> 01:04:46,650 Now there's one problem with this, and that is that we have 1292 01:04:46,650 --> 01:04:50,630 pointers in these objects. 1293 01:04:50,630 --> 01:04:54,120 And when I copy it down here, how did I make sure that all 1294 01:04:54,120 --> 01:04:57,050 the pointers now point to the new locations? 1295 01:04:57,050 --> 01:05:00,010 So here's how you do that. 1296 01:05:00,010 --> 01:05:03,740 So since the FROM address of the object is not generally 1297 01:05:03,740 --> 01:05:06,980 equal to the TO address, pointers must be updated. 1298 01:05:06,980 --> 01:05:09,340 So I have to know, of course, where the pointers are. 1299 01:05:09,340 --> 01:05:12,100 So the idea is that when an object is copied to the TO 1300 01:05:12,100 --> 01:05:15,920 space, what we're going to do in FROM space is store a 1301 01:05:15,920 --> 01:05:19,010 forwarding pointer in FROM space saying, 1302 01:05:19,010 --> 01:05:20,260 I copy you to here. 1303 01:05:20,260 --> 01:05:22,800 1304 01:05:22,800 --> 01:05:25,540 So that whenever I look at something in the FROM space 1305 01:05:25,540 --> 01:05:29,490 that's already been copied, I can say, oh, here's where it's 1306 01:05:29,490 --> 01:05:30,805 been copied to. 1307 01:05:30,805 --> 01:05:35,060 It's been copied to this region in the TO space. 1308 01:05:35,060 --> 01:05:38,890 When an object is removed from the FIFO queue in the TO 1309 01:05:38,890 --> 01:05:41,685 space, we're going to update all its pointers. 1310 01:05:41,685 --> 01:05:43,105 I'm going to give you an example of this. 
1311 01:05:43,105 --> 01:05:47,580 1312 01:05:47,580 --> 01:05:50,380 So if you think about it, when I remove something from the 1313 01:05:50,380 --> 01:05:56,120 FIFO queue, at that point, I know all of its adjacent 1314 01:05:56,120 --> 01:06:03,790 vertices have been placed into the queue, into the TO space. 1315 01:06:03,790 --> 01:06:06,570 And so at that point, I know where all the final 1316 01:06:06,570 --> 01:06:09,000 destinations of pointers are. 1317 01:06:09,000 --> 01:06:12,390 So here's an example. 1318 01:06:12,390 --> 01:06:16,930 So suppose that I'm doing the copy of these guys, and I've 1319 01:06:16,930 --> 01:06:19,770 got all the pointers around here, and now some of these 1320 01:06:19,770 --> 01:06:23,420 guys have already made it into the FIFO queue. 1321 01:06:23,420 --> 01:06:25,400 Let's say this guy has made it in, because I've got a 1322 01:06:25,400 --> 01:06:28,120 forwarding pointer to there, and this guy has made it in, 1323 01:06:28,120 --> 01:06:30,080 so I've got a forwarding pointer to there. 1324 01:06:30,080 --> 01:06:33,610 And this is what my queue currently holds. 1325 01:06:33,610 --> 01:06:37,490 So when I remove an item from the queue, I then look at this 1326 01:06:37,490 --> 01:06:40,960 guy, and what I do is, first of all, I visit all of the 1327 01:06:40,960 --> 01:06:42,440 adjacent neighbors. 1328 01:06:42,440 --> 01:06:45,340 So I visit this guy and I say, oh, he's already been copied, 1329 01:06:45,340 --> 01:06:46,910 because I have a forwarding pointer. 1330 01:06:46,910 --> 01:06:48,540 So nothing to be done there. 1331 01:06:48,540 --> 01:06:51,640 But this guy, he hasn't been copied yet. 1332 01:06:51,640 --> 01:06:54,660 So I have to make sure that that guy is copied by 1333 01:06:54,660 --> 01:06:56,270 enqueuing the adjacent vertices. 1334 01:06:56,270 --> 01:07:02,220 So I copy my neighbor to there. 
1335 01:07:02,220 --> 01:07:06,350 And part of copying is, if I'm actually making a copy, I'm 1336 01:07:06,350 --> 01:07:08,630 going to have a pointer from here. 1337 01:07:08,630 --> 01:07:13,590 The pointer now has to point to this guy over here, because 1338 01:07:13,590 --> 01:07:15,140 that's how I'm copying. 1339 01:07:15,140 --> 01:07:17,260 Otherwise it isn't a copy. 1340 01:07:17,260 --> 01:07:20,780 So in the copy, this pointer is the same as that one, is 1341 01:07:20,780 --> 01:07:23,050 pointing to the same storage location, namely 1342 01:07:23,050 --> 01:07:25,520 this place over here. 1343 01:07:25,520 --> 01:07:29,170 So I do that for all my adjacent vertices. 1344 01:07:29,170 --> 01:07:31,660 People follow so far? 1345 01:07:31,660 --> 01:07:35,760 Now I've copied all of the neighbors for this guy. 1346 01:07:35,760 --> 01:07:40,210 Now this guy, all of his neighbors are in their final 1347 01:07:40,210 --> 01:07:43,070 destinations in the TO space. 1348 01:07:43,070 --> 01:07:46,520 So that means I can now update these pointers to be the 1349 01:07:46,520 --> 01:07:48,310 pointers in the TO space. 1350 01:07:48,310 --> 01:07:51,590 So I do that as follows. 1351 01:07:51,590 --> 01:07:54,310 Oh, first of all, I forgot to place the forwarding pointer 1352 01:07:54,310 --> 01:07:56,930 here to indicate that he's been copied. 1353 01:07:56,930 --> 01:07:59,200 That's how I'm doing the marks. 1354 01:07:59,200 --> 01:08:00,540 Sorry about that. 1355 01:08:00,540 --> 01:08:03,690 So I copied the values, and that's just a straight copy, 1356 01:08:03,690 --> 01:08:05,380 because all the pointer values are now 1357 01:08:05,380 --> 01:08:08,130 pointing into this space. 1358 01:08:08,130 --> 01:08:10,170 Then I put a forwarding pointer in. 1359 01:08:10,170 --> 01:08:15,040 Now I'm ready to do the update of the pointers here. 1360 01:08:15,040 --> 01:08:17,819 So I update the pointers to the removed item. 
1361 01:08:17,819 --> 01:08:19,000 So here's one pointer. 1362 01:08:19,000 --> 01:08:21,140 I get rid of him there and I make him point to the 1363 01:08:21,140 --> 01:08:21,770 forwarding thing. 1364 01:08:21,770 --> 01:08:25,100 He used to point to here, now he's pointing to here. 1365 01:08:25,100 --> 01:08:27,569 This one used to point to here, I want to make 1366 01:08:27,569 --> 01:08:28,819 him point to there. 1367 01:08:28,819 --> 01:08:33,399 1368 01:08:33,399 --> 01:08:36,220 There we go. 1369 01:08:36,220 --> 01:08:37,350 So now he's pointing here. 1370 01:08:37,350 --> 01:08:40,979 And now this guy here now has his pointers to his final 1371 01:08:40,979 --> 01:08:43,270 destinations. 1372 01:08:43,270 --> 01:08:45,899 And so the invariance that I'm maintaining is that everything 1373 01:08:45,899 --> 01:08:52,630 before the head of the queue all have their final pointers 1374 01:08:52,630 --> 01:08:54,890 in place in the TO space. 1375 01:08:54,890 --> 01:08:56,359 Question? 1376 01:08:56,359 --> 01:08:59,119 AUDIENCE: So do you know where to put the pointers-- 1377 01:08:59,119 --> 01:09:02,734 how do [UNINTELLIGIBLE] pointers go from the TO space 1378 01:09:02,734 --> 01:09:04,883 to the TO space so that they're actually getting to 1379 01:09:04,883 --> 01:09:05,545 their final destination? 1380 01:09:05,545 --> 01:09:08,025 So my question is, [UNINTELLIGIBLE] by following 1381 01:09:08,025 --> 01:09:09,275 [UNINTELLIGIBLE]. 1382 01:09:09,275 --> 01:09:13,000 1383 01:09:13,000 --> 01:09:15,699 But how do you know what pointer to follow? 1384 01:09:15,699 --> 01:09:17,539 How do you know that [INAUDIBLE]? 1385 01:09:17,539 --> 01:09:19,380 CHARLES LEISERSON: If I have a pointer to something in the 1386 01:09:19,380 --> 01:09:23,870 FROM space, then I update it to be the forwarding pointer 1387 01:09:23,870 --> 01:09:25,196 in the TO space. 
1388 01:09:25,196 --> 01:09:27,015 AUDIENCE: Yeah, the question is, how do you know that 1389 01:09:27,015 --> 01:09:27,729 [INAUDIBLE] 1390 01:09:27,729 --> 01:09:28,859 it's being pointed to [? the ?] 1391 01:09:28,859 --> 01:09:31,607 actually in the TO space? 1392 01:09:31,607 --> 01:09:33,439 Can you check, is this in the TO space? 1393 01:09:33,439 --> 01:09:33,939 CHARLES LEISERSON: Well, yeah. 1394 01:09:33,939 --> 01:09:35,990 I mean, you know what the range of the two spaces is, so 1395 01:09:35,990 --> 01:09:37,240 you could check that. 1396 01:09:37,240 --> 01:09:39,330 1397 01:09:39,330 --> 01:09:39,510 OK. 1398 01:09:39,510 --> 01:09:40,140 Question? 1399 01:09:40,140 --> 01:09:42,322 AUDIENCE: So the reason you would be using [? BFS ?] 1400 01:09:42,322 --> 01:09:44,505 instead of [? DFS ?] is because [INAUDIBLE]. 1401 01:09:44,505 --> 01:09:47,410 1402 01:09:47,410 --> 01:09:49,189 CHARLES LEISERSON: The reason you're using BFS is because it 1403 01:09:49,189 --> 01:09:51,910 has this nice property that the queue represents the 1404 01:09:51,910 --> 01:09:53,390 copying of all the values. 1405 01:09:53,390 --> 01:09:56,600 If I use DFS, I need a stack to do it, and then I'm 1406 01:09:56,600 --> 01:10:01,710 overwriting the locations where the vertices would be 1407 01:10:01,710 --> 01:10:04,280 stored in the data structure. 1408 01:10:04,280 --> 01:10:08,240 By using BFS, there's just this clever thing that the 1409 01:10:08,240 --> 01:10:11,010 queue, when I'm done with BFS, represents all 1410 01:10:11,010 --> 01:10:13,550 of the visited vertices. 1411 01:10:13,550 --> 01:10:15,620 It's very clever. 1412 01:10:15,620 --> 01:10:17,230 Wish I'd thought of that. 1413 01:10:17,230 --> 01:10:19,520 But I was, I think, only 10 years old or something when 1414 01:10:19,520 --> 01:10:23,475 they invented this, and I wasn't that precocious. 1415 01:10:23,475 --> 01:10:26,390 1416 01:10:26,390 --> 01:10:28,570 OK? 
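Putting the pieces together, here is a minimal Python sketch of the copying scheme just described: the TO space doubles as the BFS queue, forwarding pointers are left behind in the FROM space, and an object's pointers are rewritten when it is removed from the queue. The dict-based "address space" is, of course, a stand-in for real memory:

```python
def copy_collect(roots, from_space):
    """Stop-and-copy collection over a toy FROM space.

    `from_space` maps an address to a list of pointer fields
    (addresses).  Returns (new_roots, to_space); addresses in the
    result are simply indices into the to_space list.
    """
    to_space = []   # implements the FIFO queue
    forward = {}    # forwarding pointers left in the FROM space

    def copy(addr):
        # First visit: copy the object to the end of TO space and
        # leave a forwarding pointer.  Later visits just follow it.
        if addr not in forward:
            forward[addr] = len(to_space)
            to_space.append(list(from_space[addr]))  # fields still
        return forward[addr]                         # point into FROM

    new_roots = [copy(r) for r in roots]
    scan = 0                       # head of the queue
    while scan < len(to_space):
        obj = to_space[scan]       # remove the next item
        scan += 1
        # Copy any neighbors not yet copied, then update this
        # object's pointers to their final TO-space addresses.
        obj[:] = [copy(p) for p in obj]
    return new_roots, to_space

# Three live objects in a cycle at "addresses" 100/104/108; the
# self-referencing object at 112 is garbage and is never copied.
from_space = {100: [104], 104: [100, 108], 108: [], 112: [112]}
print(copy_collect([100], from_space))   # ([0], [[1], [0, 2], []])
```

Everything before `scan` already has all of its pointers in their final places, which is exactly the invariant maintained in the lecture.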
1417 01:10:28,570 --> 01:10:30,560 So very, very clever. 1418 01:10:30,560 --> 01:10:35,340 So this is the scheme behind the Minsky, Fenichel, 1419 01:10:35,340 --> 01:10:37,120 Yochelson garbage collector. 1420 01:10:37,120 --> 01:10:40,160 So Marvin Minsky, you may have heard of, was one of the 1421 01:10:40,160 --> 01:10:44,020 inventors of a scheme like this. 1422 01:10:44,020 --> 01:10:45,270 One reason he's so famous. 1423 01:10:45,270 --> 01:10:48,020 1424 01:10:48,020 --> 01:10:49,020 Good. 1425 01:10:49,020 --> 01:10:51,670 So this is really nice, because it takes me linear 1426 01:10:51,670 --> 01:10:55,990 time to copy and update all of the vertices. 1427 01:10:55,990 --> 01:10:57,800 It's just one pass of breadth-first search. 1428 01:10:57,800 --> 01:11:00,990 Very fast to copy and update all the vertices. 1429 01:11:00,990 --> 01:11:02,840 Sometimes, by the way, people call these forwarding 1430 01:11:02,840 --> 01:11:04,760 pointers, they call them broken hearts. 1431 01:11:04,760 --> 01:11:06,220 You'll see that in the literature. 1432 01:11:06,220 --> 01:11:09,400 They call them a broken heart if it's-- 1433 01:11:09,400 --> 01:11:13,750 I don't quite remember why that's the-- 1434 01:11:13,750 --> 01:11:15,580 it's like, I went there, oh, you're not 1435 01:11:15,580 --> 01:11:16,800 there, it's over there-- 1436 01:11:16,800 --> 01:11:18,050 it's like, I don't know. 1437 01:11:18,050 --> 01:11:20,400 1438 01:11:20,400 --> 01:11:23,980 Now, the last thing you have to do-- and this is something 1439 01:11:23,980 --> 01:11:27,370 that's not talked about a lot in the literature-- 1440 01:11:27,370 --> 01:11:29,460 is how do you manage these from and to 1441 01:11:29,460 --> 01:11:33,830 spaces in virtual memory? 1442 01:11:33,830 --> 01:11:37,440 So here's the strategy that works well. 
1443 01:11:37,440 --> 01:11:41,660 So basically, typically what I'm doing is after I've done a 1444 01:11:41,660 --> 01:11:45,130 copy, I've emptied everything out of the FROM space, and 1445 01:11:45,130 --> 01:11:49,590 I've got a portion that's used in the TO space, and then an 1446 01:11:49,590 --> 01:11:52,500 unused portion that I'm going to be able to do allocations 1447 01:11:52,500 --> 01:11:55,900 out of for the next pass around. 1448 01:11:55,900 --> 01:11:59,580 So what I do at that point is I make the old TO 1449 01:11:59,580 --> 01:12:01,260 become the new FROM. 1450 01:12:01,260 --> 01:12:02,980 So here's the new FROM. 1451 01:12:02,980 --> 01:12:04,740 This is available space. 1452 01:12:04,740 --> 01:12:08,940 And now what I do is I extend this space, if necessary, to 1453 01:12:08,940 --> 01:12:12,230 make the new heap space that I'm going to allocate out of 1454 01:12:12,230 --> 01:12:16,130 equal in size to the used space. 1455 01:12:16,130 --> 01:12:18,980 And the reason for doing this is to amortize the cost of 1456 01:12:18,980 --> 01:12:22,640 going through all of the storage, so that the cost of 1457 01:12:22,640 --> 01:12:25,660 doing the garbage collection can be amortized 1458 01:12:25,660 --> 01:12:26,680 against what's used. 1459 01:12:26,680 --> 01:12:29,230 If I didn't extend this, I might have only freed up a 1460 01:12:29,230 --> 01:12:33,170 tiny little bit of space, and now when I run out of that 1461 01:12:33,170 --> 01:12:35,660 space, I'll run the garbage collector going over all the 1462 01:12:35,660 --> 01:12:39,630 elements, but I'll only have gotten a few elements 1463 01:12:39,630 --> 01:12:39,800 [INAUDIBLE]. 1464 01:12:39,800 --> 01:12:42,070 So what I want to do is make sure that I've done at least a 1465 01:12:42,070 --> 01:12:45,300 certain number of allocations proportional to the number 1466 01:12:45,300 --> 01:12:48,090 that I'm going to have to walk over in order to finish it. 
1467 01:12:48,090 --> 01:12:50,630 So that's why you make this be the same size. 1468 01:12:50,630 --> 01:12:54,110 Now when the space runs out, the new TO is allocated with 1469 01:12:54,110 --> 01:12:57,580 the same size as FROM. 1470 01:12:57,580 --> 01:13:03,360 So if you can allocate it back in the old FROM space here, 1471 01:13:03,360 --> 01:13:04,720 that's great. 1472 01:13:04,720 --> 01:13:05,950 If there's room here to allocate it 1473 01:13:05,950 --> 01:13:07,340 at the front, super. 1474 01:13:07,340 --> 01:13:08,600 That's where you allocate it. 1475 01:13:08,600 --> 01:13:12,500 But as in this example, the FROM space here is bigger than 1476 01:13:12,500 --> 01:13:14,770 the space that goes up to here. 1477 01:13:14,770 --> 01:13:19,560 So therefore what I do is I allocate it at the end, and 1478 01:13:19,560 --> 01:13:22,600 then I go through and copy to here. 1479 01:13:22,600 --> 01:13:25,620 Now if this ended up being used, then the next time 1480 01:13:25,620 --> 01:13:30,400 around, I would be able to reuse this space at the front 1481 01:13:30,400 --> 01:13:33,440 for the next time that I flip the garbage collection. 1482 01:13:33,440 --> 01:13:38,220 The idea is, if things will fit as early in memory as 1483 01:13:38,220 --> 01:13:39,470 possible, that's where I put them. 1484 01:13:39,470 --> 01:13:41,760 But if they don't fit, I put them later. 1485 01:13:41,760 --> 01:13:44,420 And you can prove, if you do this, that the virtual memory 1486 01:13:44,420 --> 01:13:48,380 space used is order 1 times the optimal, where you're 1487 01:13:48,380 --> 01:13:51,330 getting nearly the best virtual memory space used 1488 01:13:51,330 --> 01:13:56,050 compared to the amount of items that are in use by the 1489 01:13:56,050 --> 01:13:58,460 user at any time. 1490 01:13:58,460 --> 01:14:00,880 So it keeps everything sort of compact towards 1491 01:14:00,880 --> 01:14:02,310 the front end here. 
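The sizing-and-placement rule can be written down concretely. The sketch below uses hypothetical byte offsets; `place_to_space` just captures the "put it as early in memory as it fits" policy from the lecture, and `next_heap_size` the "make the allocation area as big as what's used" rule:

```python
def next_heap_size(used):
    # After a collection that left `used` bytes live, extend the
    # fresh allocation area to `used` bytes as well.  The next
    # collection costs O(used) work but happens only after at least
    # `used` bytes of allocation, so the amortized cost per
    # allocated byte is O(1).
    return used

def place_to_space(from_start, from_size):
    """Pick the starting offset of the new TO space.

    The TO space must be `from_size` bytes.  If it fits in the
    prefix of memory before the FROM space, reuse that prefix;
    otherwise put it immediately after the FROM space.
    """
    if from_size <= from_start:
        return 0                        # fits at the front: reuse it
    return from_start + from_size       # otherwise extend at the end

print(place_to_space(8, 4))   # 0: a 4-byte TO space fits before offset 8
print(place_to_space(2, 6))   # 8: doesn't fit in front, goes after FROM
```

With this discipline, the total virtual-memory footprint stays within a constant factor of the space actually in use, which is the order-1-times-optimal claim above.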
1492 01:14:02,310 --> 01:14:04,360 So in this case, I had to allocate to here. 1493 01:14:04,360 --> 01:14:07,410 If the use had been very small, I could have allocated 1494 01:14:07,410 --> 01:14:09,630 just a very small amount for the heap. 1495 01:14:09,630 --> 01:14:12,490 When I was done, the new TO space would 1496 01:14:12,490 --> 01:14:13,550 have fit at the beginning. 1497 01:14:13,550 --> 01:14:16,380 I would have allocated it at the beginning. 1498 01:14:16,380 --> 01:14:19,550 And it's a nice little exercise, for those of you who 1499 01:14:19,550 --> 01:14:23,050 are analytically inclined, to figure out what the constant 1500 01:14:23,050 --> 01:14:27,570 is here, because it's a little bit of a game to figure out 1501 01:14:27,570 --> 01:14:30,730 what the amortized constant is, for those who have some 1502 01:14:30,730 --> 01:14:37,700 algorithmic energy and are confident about working on 1503 01:14:37,700 --> 01:14:39,365 this kind of thing instead of studying for the quiz. 1504 01:14:39,365 --> 01:14:43,460 1505 01:14:43,460 --> 01:14:45,880 So dynamic storage allocation. 1506 01:14:45,880 --> 01:14:48,960 Hugely interesting area. 1507 01:14:48,960 --> 01:14:50,930 Lots is known and lots is unknown. 1508 01:14:50,930 --> 01:14:52,430 For example, I mentioned coalescing. 1509 01:14:52,430 --> 01:14:54,460 We really don't understand coalescing. 1510 01:14:54,460 --> 01:14:57,680 Despite the fact that people have been doing it for 50 years 1511 01:14:57,680 --> 01:15:01,490 or more, we do not understand coalescing. 1512 01:15:01,490 --> 01:15:04,190 There are a lot of strategies that people have studied, 1513 01:15:04,190 --> 01:15:06,300 things like the buddy system, I mentioned. 1514 01:15:06,300 --> 01:15:09,050 There are other types of garbage collectors called mark 1515 01:15:09,050 --> 01:15:10,820 and sweep garbage collection. 
1516 01:15:10,820 --> 01:15:13,500 Those are typically inferior to the copying garbage 1517 01:15:13,500 --> 01:15:19,070 collection, because they leave you with a 1518 01:15:19,070 --> 01:15:21,980 memory that isn't compact. 1519 01:15:21,980 --> 01:15:24,390 And so you end up using more and more of your virtual 1520 01:15:24,390 --> 01:15:27,560 memory, because you're leaving things in place. 1521 01:15:27,560 --> 01:15:29,840 On the other hand, you leave things in place so there's no 1522 01:15:29,840 --> 01:15:32,100 need to move pointers. 1523 01:15:32,100 --> 01:15:33,970 So that can be a valuable thing. 1524 01:15:33,970 --> 01:15:35,620 There's things called generational garbage 1525 01:15:35,620 --> 01:15:36,650 collection. 1526 01:15:36,650 --> 01:15:39,660 Generational garbage collectors realize that things 1527 01:15:39,660 --> 01:15:43,930 that are allocated typically are freed very soon 1528 01:15:43,930 --> 01:15:48,660 thereafter, and so therefore, if you especially take care of 1529 01:15:48,660 --> 01:15:51,080 the things that are allocated and guess that they're going 1530 01:15:51,080 --> 01:15:55,640 to be used again, and that things that haven't been 1531 01:15:55,640 --> 01:15:58,980 deallocated for a long time are very unlikely to be 1532 01:15:58,980 --> 01:16:02,540 allocated, you can get better performance out 1533 01:16:02,540 --> 01:16:04,580 of a garbage collector. 1534 01:16:04,580 --> 01:16:06,730 There's real time garbage collection. 1535 01:16:06,730 --> 01:16:08,810 So this is what I've explained, is what's called a 1536 01:16:08,810 --> 01:16:11,500 stop and copy garbage collector. 1537 01:16:11,500 --> 01:16:13,560 Meaning that when you run out of storage, oops! 1538 01:16:13,560 --> 01:16:15,430 Stop the world, run the garbage collector. 
1539 01:16:15,430 --> 01:16:17,880 There are real time garbage collectors which, whenever you 1540 01:16:17,880 --> 01:16:21,980 do an allocation, it goes and it does a few cycles of 1541 01:16:21,980 --> 01:16:26,680 garbage collection, and is such that you can always keep 1542 01:16:26,680 --> 01:16:30,060 up with the allocations and deallocations going on. 1543 01:16:30,060 --> 01:16:34,070 By doing a few cycles, you eventually get to garbage 1544 01:16:34,070 --> 01:16:37,960 collect everything, but you do it intermixed, so that you're 1545 01:16:37,960 --> 01:16:40,140 slowing everything down by a constant amount, rather than 1546 01:16:40,140 --> 01:16:42,990 running and suddenly, OK, your robot arm is going like this, 1547 01:16:42,990 --> 01:16:45,276 oops, garbage collect, garbage collect, garbage collect, 1548 01:16:45,276 --> 01:16:46,850 [CREAKING SOUND]. 1549 01:16:46,850 --> 01:16:50,160 So you can have your robot arm move smoothly, because it's 1550 01:16:50,160 --> 01:16:52,080 paying the price all the way. 1551 01:16:52,080 --> 01:16:54,430 It's not stopping and going. 1552 01:16:54,430 --> 01:16:57,570 There is multi-threaded storage allocation. 1553 01:16:57,570 --> 01:17:04,390 What happens when you have a multi-threaded environment 1554 01:17:04,390 --> 01:17:07,240 where you have many processors that are all allocating out of 1555 01:17:07,240 --> 01:17:08,040 the same pool? 1556 01:17:08,040 --> 01:17:09,790 How do they organize it? 1557 01:17:09,790 --> 01:17:11,920 You can have the interesting things going there where 1558 01:17:11,920 --> 01:17:15,990 things are allocated on one processor, and then get 1559 01:17:15,990 --> 01:17:17,800 deallocated on another. 
1560 01:17:17,800 --> 01:17:20,050 And if you're not careful, you have something like a memory 1561 01:17:20,050 --> 01:17:24,420 leak going on, which is called memory drift, where the memory 1562 01:17:24,420 --> 01:17:27,660 gets allocated on one, and one guy's busy creating it, and it 1563 01:17:27,660 --> 01:17:30,360 keeps getting deallocated on the other, but never gets 1564 01:17:30,360 --> 01:17:33,210 recycled back to the guy who needs it. 1565 01:17:33,210 --> 01:17:36,280 And so how is it that you figure out to make sure that 1566 01:17:36,280 --> 01:17:39,590 memory drift doesn't end up having you grow your memory? 1567 01:17:39,590 --> 01:17:42,580 In fact, many, many garbage collectors that are out there 1568 01:17:42,580 --> 01:17:45,910 have this memory drift problem for multi-threaded storage 1569 01:17:45,910 --> 01:17:46,710 allocation. 1570 01:17:46,710 --> 01:17:48,740 There's also parallel garbage collection. 1571 01:17:48,740 --> 01:17:51,480 How do you, in a parallel environment, have the garbage 1572 01:17:51,480 --> 01:17:54,470 collector going on as a separate process while you 1573 01:17:54,470 --> 01:17:56,670 have other things going on at the same time? 1574 01:17:56,670 --> 01:17:59,960 So a huge, huge, huge area of storage. 1575 01:17:59,960 --> 01:18:05,240 And for our next lab, lab 3, you'll be building your own 1576 01:18:05,240 --> 01:18:06,100 storage allocator. 1577 01:18:06,100 --> 01:18:08,880 So you have a lot of fun with this. 1578 01:18:08,880 --> 01:18:10,580 [? Reid ?] says this is his favorite lab. 1579 01:18:10,580 --> 01:18:13,560 1580 01:18:13,560 --> 01:18:13,820 Great. 1581 01:18:13,820 --> 01:18:15,070 Any questions? 1582 01:18:15,070 --> 01:18:17,560 1583 01:18:17,560 --> 01:18:17,910 OK. 1584 01:18:17,910 --> 01:18:19,300 Good luck on the quiz next time. 1585 01:18:19,300 --> 01:18:20,885